Mammalian proteinases; related reagents and methods

ABSTRACT

Nucleic acids encoding various proteases from a mammal, reagents related thereto (including specific antibodies), and purified proteins are described. Methods of using said reagents and related diagnostic kits are also provided.

This filing is a continuation application of commonly assigned U.S. Pat. No. 09/005,263, filed Jan. 9, 1998, now abandoned, which is incorporated herein by reference.

FIELD OF THE INVENTION

The present invention contemplates compositions related to proteins from animals, e.g., mammals, which function as proteinases. In particular, it provides nucleic acids which encode the proteinases, antibodies thereto, and proteins which exhibit biological functions, e.g., capacity to degrade proteinaceous substrates.

BACKGROUND OF THE INVENTION

The proteases are a very broad group of enzymes which carry out an enzymatic function of hydrolyzing a peptide bond. See, e.g., Beynon (ed. 1989) Proteolytic Enzymes: A Practical Approach IRL Press, Oxford; Methods in Enzymology vols. 244 and 248. Within the group, there is a wide range of substrate specificities for the amino acids adjacent to the cleavage sites. Proteases are typically categorized on the basis of their catalytic mechanisms, e.g., based upon studies of their active sites, or by the effects of pH. Four main categories of proteases are serine proteinases, sulfhydryl proteases, acid proteases, and metalloproteases. They may also be classified according to their sites of substrate cleavage, e.g., endoproteases, amino peptidases, or carboxypeptidases.

Proteases have traditionally held a large share of the industrial enzyme market. Proteases are used in many industrial processes, including in detergents and cleaning products, e.g., to degrade protein materials such as blood and stains; in leather production, e.g., to remove hair; in baking, e.g., to break down glutens; in flavorings, e.g., soy sauce; in meat tenderizing, e.g., to break down collagen; in gelatin or food supplement production; in the textile industry; in waste treatment; and in the photographic industry. See, e.g., Gusek (1991) Inform 1:14-18; Zamost, et al. (1996) J. Industrial Microbiol. 8:71-82; James and Simpson (1996) CRC Critical Reviews in Food Science and Nutrition 36:437-463; Teichgraeber, et al. (1993) Trends in Food Science and Technology 4:145-149; Tjwan, et al. (1993) J. Dairy Research 60:269-286; Haard (1992) J. Aquatic Food Product Technology 1:17-35; van Dijk (1995) Laundry and Cleaning News 21:32-33; Nolte, et al. (1996) J. Textile Institute 87:212-226; Chikkodi, et al. (1995) Textile Res. J. 65:564-569; and Shih (1993) Poultry Science 72:1617-1620.

Matrix metalloproteinases (MMPs) are a family of enzymes whose main physiological function is degradation of the extracellular matrix. See, e.g., Parsons, et al. (1997) Br. J. Surgery 84:160-166. These enzymes are present in normal healthy individuals and have been shown to have an important role in processes such as wound healing (see Wolf, et al. (1992) J. Invest. Dermatol. 99:870-872; and Wysocki, et al. (1993) J. Invest. Dermatol. 101:64-68), pregnancy and parturition (see Jeffrey (1991) Seminars Perinatol. 15:118-126), bone resorption (see Delaisse and Vaes, pp. 290-314 in Rifkin and Gay (eds. 1992) Biology and Physiology of the Osteoclast CRC Press, Boca Raton, Fla.), and mammary involution (Talhouk, et al. (1992) J. Cell Biol. 118:1271-1282). See also Nagase (1996) in Hooper (ed.) Zinc Metalloproteinases in Health and Disease Taylor and Francis, London. A recent focus on the MMPs is on their role in certain disease states in which breakdown of the extracellular matrix is a key feature, e.g., in rheumatoid arthritis (see Harris (1990) N. Engl. J. Med. 322:1277-1289), periodontal disease (see Page (1991) J. Periodont. Res. 26:230-242), and cancer (see Brown (1997) Medical Oncology 14:1-10; Chambers and Matrisian (1997) J. NCI 89:1260-1270; Yu, et al. (1997) Drugs and Aging 11:229-244; Yu, et al. (1997) Clinical Pharmacology 11:229-244; Wojtowicz-Praga, et al. (1997) Invest. New Drugs 15:61-75; Coussens and Werb (1996) Chem. Biol. 3:895-904; and Talbot and Brown (1996) Eur. J. Cancer 32A:2528-2533).

While there are many uses for proteases, there is always the need for a more active or specific protease under various specific conditions. Alternatively, the distribution of these gene products may be useful as markers for specific cell or tissue types. There is a need for new proteinases of differing properties, specificities, and activities.

SUMMARY OF THE INVENTION

In a search for dendritic cell (DC)-restricted molecules, a novel member of the MMP family of proteolytic enzymes was identified which belongs to the Membrane-type Matrix Metalloproteinase (MT-MMP) subclass. This fifth MT-MMP proteinase, located on chromosome 16p13.3, is present in spleen, lymph node, thymus, appendix, PBL, and bone marrow, and is strongly expressed by DC and weakly by granulocytes and effector T cells. Interestingly, the mRNA expression of this gene is down-regulated by CD40L activation of CD34⁺- and monocyte-derived DC. According to its cellular expression and putative membrane localization, a role is proposed for this novel Membrane-type Matrix Metalloproteinase gene in degradation of the extracellular matrix during DC migration.

The present invention provides a binding compound comprising an antibody binding site which specifically binds to primate F06B09 protein; a nucleic acid comprising sequence encoding at least 12 amino acids of primate F06B09 protein; a substantially pure protein which is specifically recognized by the above antibody binding site; a substantially pure primate F06B09 protein or peptide thereof; and a fusion protein comprising a 30 amino acid sequence portion of primate F06B09 protein sequence.

In certain binding compound embodiments, the antibody binding site is specifically immunoreactive with a protein selected from polypeptides of SEQ ID NO: 4; is raised against a purified or recombinantly produced primate F06B09 protein; is immunoselected on a substantially purified or recombinantly produced primate F06B09 protein; is in a monoclonal antibody, Fab, or F(ab)2; is detectably labeled; is attached to a solid substrate; is from a rabbit or mouse; binds with a Kd of at least about 300 μM; is fused to another protein segment; is in a chimeric antibody; or is coupled to another chemical moiety.

The invention also provides a method of making an antigen-antibody complex, comprising a step of contacting a primate biological sample to a specific binding antibody described. In preferred embodiments, the method further includes steps to purify the antigen or antibody.

Alternative embodiments provide an antibody binding site wherein the binding site is detected in a biological sample by a method comprising the steps of contacting a binding agent having an affinity for F06B09 protein with the biological sample; incubating the binding agent with the biological sample to form a binding agent:F06B09 protein complex; and detecting the complex. In certain embodiments, the biological sample is human, and the binding agent is an antibody.

The invention also provides kits containing a composition described above and instructional material for the use of the composition; or segregation of the composition into a container. Typically, the kit is used to make a qualitative or quantitative analysis.

The invention also embraces a cell comprising an antibody described above; a cell transfected with a nucleic acid described above; or a cell comprising a fusion protein described above.

In nucleic acid embodiments, the nucleic acid may encode a polypeptide which specifically binds an antibody generated against an immunogen selected from the group consisting of the mature polypeptides of SEQ ID NO: 4. Alternatively, the nucleic acid may encode at least 12 amino acids of SEQ ID NO: 4; comprise sequence of at least about 39 nucleotides selected from protein coding portions of SEQ ID NO: 1 or 3; hybridize to SEQ ID NO: 1 or 3 under stringent wash conditions of at least 45° C. and less than about 150 mM salt; comprise sequence made by a synthetic method; be an expression vector; be detectably labeled; be attached to a solid substrate; be from human; bind with a Kd of at least about 300 μM; be fused to another nucleic acid segment; be coupled to another chemical moiety; be operably associated with a promoter, ribosome binding site, or poly-A addition site; be a PCR product; be transformed into a cell, including a bacterial cell; be in a sterile composition; be capable of selectively hybridizing to a nucleic acid encoding an F06B09 protein; comprise a natural sequence; comprise a mature protein coding segment of SEQ ID NO: 1 or 3; encode a proteolytically active portion of F06B09; or be detected in a biological sample by a method comprising: contacting a biological sample with a nucleic acid probe capable of selectively hybridizing to said nucleic acid; incubating the nucleic acid probe with the biological sample to form a hybrid of the nucleic acid probe with complementary nucleic acid sequences present in the biological sample; and determining the extent of hybridization of the nucleic acid probe to the complementary nucleic acid sequences, including the method where the nucleic acid probe is capable of hybridizing to a nucleic acid encoding a protein selected from the group consisting of the mature polypeptides of SEQ ID NO: 4.

In protein or polypeptide embodiments, the proteins may bind with a Kd of at least about 300 μM to an antibody generated against an immunogen of the polypeptides of SEQ ID NO: 4; be immunoselected on an antibody which selectively binds a polypeptide of SEQ ID NO: 4; comprise sequence of at least 12 contiguous residues of SEQ ID NO: 4; exhibit a post-translational modification pattern distinct from natural F06B09; be 3-fold or fewer substituted from natural sequence; be recombinantly produced; be denatured; have sequence of full length natural polypeptide; be detectably labeled; be attached to a solid substrate; be from human; be in a sterile composition; be fused to another protein segment; be coupled to another chemical moiety; comprise at least a fragment of at least 32 amino acid residues from a human F06B09 protein; comprise mature polypeptide sequence selected from the group consisting of SEQ ID NO 4 and 4; be a soluble protein; be a naturally occurring protein; or be a proteolytically active portion of F06B09.

The invention also provides an isolated protein which specifically binds to an antibody generated against an immunogen selected from the group consisting of the full length polypeptides of SEQ ID NO: 4. Preferably such protein binds to the antibody with a Kd of at least about 300 μM; is immunoselected on an antibody which selectively binds a polypeptide of SEQ ID NO: 4; comprises sequence of at least 12 contiguous residues of SEQ ID NO: 4; exhibits a post-translational modification pattern distinct from natural F06B09; is 3-fold or fewer substituted from natural sequence; is recombinantly produced; is denatured; has sequence of full length natural polypeptide; is detectably labeled; is attached to a solid substrate; is from human; is in a sterile composition; is fused to another protein segment; is coupled to another chemical moiety; comprises at least a fragment of at least 32 amino acid residues from a human F06B09 protein; comprises mature polypeptide sequence selected from the group consisting of SEQ ID NO: 4; is a soluble protein; or comprises a proteolytic activity of F06B09.

In certain other embodiments, the invention embraces a fusion protein described above, which comprises sequence from an enzymatically active portion of SEQ ID NO: 4. Preferably such protein binds with a Kd of at least about 300 μM to an antibody generated against an immunogen having sequence of a polypeptide of SEQ ID NO: 4; is immunoselected on an antibody which selectively binds a polypeptide of SEQ ID NO: 4; comprises sequence of at least 12 contiguous residues of SEQ ID NO: 4; is recombinantly produced; is denatured; has sequence of full length natural polypeptide; is detectably labeled; is attached to a solid substrate; comprises sequence from human; is in a sterile composition; is fused to another protein segment; is coupled to another chemical moiety; comprises at least a fragment of at least 32 amino acid residues from a human F06B09 protein; comprises mature polypeptide sequence from SEQ ID NO: 4; is a soluble protein; or comprises a proteolytic activity of F06B09.

The invention also provides a substantially pure protein described above which comprises a proteolytic activity of F06B09.

A method of modulating physiology or development of a cell comprising contacting said cell with said compositions is provided.

DETAILED DESCRIPTION

OUTLINE

I. General
II. Definitions
III. Nucleic Acids
IV. Making F06B09 Protein
V. Antibodies; binding compounds
   a. antibody production
   b. immunoassays
VI. Purified F06B09 Protein
VII. Physical Variants
VIII. Binding Agent:F06B09 Protein Complexes
IX. Functional Variants
X. Uses
XI. Kits
XII. Substrate Identification

I. General

Dendritic cells (DC), present in all lymphoid and non-lymphoid organs, are professional antigen presenting cells (APC) which have the unique capacity to activate naive T cells. See, e.g., Banchereau and Steinman (1998) Nature 392:245-252; and Steinman (1991) Annu. Rev. Immunol. 9:271-296. DC originate from bone marrow and migrate as precursors through the bloodstream to non-lymphoid tissues where, at an immature stage, DC such as the epidermal Langerhans cells capture antigens with high efficiency and become circulating veiled cells. These antigen-bearing cells migrate from the peripheral non-lymphoid tissues via lymphatics or bloodstream into lymphoid tissues, where they localize in T cell-rich areas as mature interdigitating DC (IDC). See, e.g., Austyn (1996) J. Exp. Med. 183:1287-1292; Austyn, et al. (1988) J. Exp. Med. 167:646-651; Fossum (1988) Scand. J. Immunol. 27:97-105; Hoefsmit, et al. (1982) Immunobiology 161:255-265; Kripke, et al. (1990) J. Immunol. 145:2833-2838; Larsen, et al. (1990) J. Exp. Med. 172:1483-1494; Macatonia, et al. (1987) J. Exp. Med. 166:1654-1667; Romani, et al. (1989) J. Exp. Med. 169:1169-1178. At this site, IDC efficiently present processed antigens (Ags) to naive T cells and generate a specific immune response. See, e.g., Inaba, et al. (1983) Proc. Natl. Acad. Sci. USA 80:6041-6045; and Inaba and Steinman (1985) Science 229:475-479. Thus, migration constitutes an integral part of DC function.

The recruitment of DC into a site of tissue damage and the subsequent migration of DC into secondary lymphoid organs is dependent upon a dynamic and complex series of events, including activation by inflammatory stimuli. See, e.g., Butcher (1991) Cell 67:1033-1036. This mechanism implies transendothelial migration beyond the vascular compartment, involving the expression of integrin molecules, the movement along leukocyte-specific chemotactic gradients (Taub (1996) Cytokine Growth Factor Rev. 7:355-376), and possibly the secretion of matrix-degrading enzymes (Watanabe, et al. (1993) J. Cell Sci. 104:991-999). In addition, the trafficking of DC into tissues involves breaching the basement membrane (dermo-epidermic junction), which would necessitate the production of a matrix-degrading enzyme.

Matrix metalloproteinases, or matrixins, represent a group of structurally related zinc-dependent endopeptidases that are involved in extracellular matrix and basement membrane degradation and cell-matrix interactions. See, e.g., Basbaum and Werb (1996) Curr. Opin. Cell Biol. 8:731-738; Birkedal-Hansen, et al. (1993) Crit. Rev. Oral Biol. Med. 4:197-250; Mignatti and Rifkin (1993) Physiol. Rev. 73:161-195; and Stetler-Stevenson, et al. (1993) Annu. Rev. Cell Biol. 9:541-573. They play crucial roles in tissue remodeling in normal and pathological processes, including development, repair, and cancer progression. All MMPs identified to date are synthesized as an inactive proenzyme form or zymogen, contain zinc-binding sites, and need proteolytic activation to become functional proteases. According to their structural features and substrate specificity, four subclasses of MMPs have been established: collagenases have the unique capacity to degrade fibrillar collagens; gelatinases degrade basement membranes and denatured collagens; stromelysins degrade many extracellular proteins, including proteoglycans, laminin, and fibronectin; and membrane-type MMPs are supposed to have proteolytic activity on other MMPs, required for their activation. See Matrisian (1992) Bioessays 14:455-463; Stetler-Stevenson, et al. (1993) Annu. Rev. Cell Biol. 9:541-573; and Woessner (1991) FASEB J. 5:2145-2154. Among the 15 identified MMPs, four distinct members, presenting a transmembrane domain at the C terminus, have been described and are referred to as MT-MMP (Membrane-type matrix metalloproteinase). Sato, et al. (1994) Nature 370:61-65 first identified the MT1-MMP (MMP 14), responsible for the activation of progelatinase A (pro-MMP2) on the tumor cell surface that may trigger tissue invasion by tumor cells. Three additional members of this family, MT2-MMP (MMP 15), MT3-MMP (MMP 16), and MT4-MMP (MMP 17), have been isolated respectively from lung, placenta, and breast carcinoma cDNA libraries. See Puente, et al. (1996) Cancer Res. 56:944-949; Takino, et al. (1995) J. Biol. Chem. 270:23013-23020; and Will and Hinzmann (1995) Eur. J. Biochem. 231:602-608. All MT-MMPs, like the stromelysin-3 MMP (Basset, et al. (1990) Nature 348:699-704), contain an insertion of about ten amino acids, with the consensus RxK/RR, between the propeptide and the catalytic domain, corresponding to potential sites of cleavage by furin (Basbaum and Werb (1996) Curr. Opin. Cell Biol. 8:731-8; Sang and Douglas (1996) J. Protein Chem. 15:137-160). This cleavage is necessary to give rise to an activated form of MT-MMPs. The MT-MMPs are located in ternary complexes including a substrate, a tissue inhibitor of MMPs (TIMPs), and an activated MT-MMP, associated with the plasma membrane (Stetler-Stevenson, et al. (1993) Annu. Rev. Cell Biol. 9:541-573). As described for MT1-, MT2-, and MT3-MMPs, MT-MMPs may have a proteolytic activity on other MMPs like gelatinase A (pro-MMP2) and collagenase-3 (MMP 13) (see Butler, et al. (1997) Eur. J. Biochem. 244:653-657; Knauper, et al. (1996) J. Biol. Chem. 271:17124-17131; Kolkenbrock, et al. (1997) Biol. Chem. 378:71-76; Sato, et al. (1994) Nature 370:61-65; Strongin, et al. (1993) J. Biol. Chem. 268:14033-14039; and Takino, et al. (1995) J. Biol. Chem. 270:23013-23020).
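As an illustration only, the RxK/RR consensus mentioned above can be located in a protein sequence with a simple pattern scan. The sketch below is not part of the described methods; the function name `find_furin_sites` is hypothetical, and the test fragment is a one-letter transcription of the residues surrounding the Arg-rich stretch shown in Table 1. A lookahead is used so that overlapping hits within a run of arginines are all reported.

```python
import re

# Consensus quoted in the text: Arg, any residue, Lys or Arg, Arg (R-x-K/R-R).
# The lookahead wrapper lets overlapping matches be reported.
FURIN_MOTIF = re.compile(r"(?=(R.[KR]R))")


def find_furin_sites(protein):
    """Return (0-based offset, tetrapeptide) for every R-x-K/R-R hit.

    `protein` is a one-letter amino acid string; the helper name and
    return format are illustrative assumptions, not from the source.
    """
    return [(m.start(), m.group(1)) for m in FURIN_MOTIF.finditer(protein.upper())]


# Fragment transcribed from Table 1 (residues around Arg-Arg-Arg-Arg-Arg).
fragment = "VLGVAGLVRRRRRYGLSGSVW"
print(find_furin_sites(fragment))  # [(8, 'RRRR'), (9, 'RRRR')]
```

A hit of this kind only flags a candidate site; whether cleavage actually occurs there is an experimental question.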

The present invention provides DNA sequences encoding mammalian proteins which exhibit structural properties or motifs characteristic of a protease, more particularly a matrix metalloproteinase. The proteins described herein are designated F06B09. See Table 1.

Through an effort aimed at identifying human dendritic cell (DC)-specific genes, the cDNA coding for a fifth member of the human Membrane-type Matrix Metalloproteinase (MT-MMP) family has been cloned. The full-length 3691 bp cDNA, which was mapped to chromosome 16p13.3, contains an open reading frame of some 1689 bp, encoding a 562 amino acid protein. The predicted protein was most homologous (48% amino acid homology) with the human matrix metalloproteinase MT4-MMP and has the typical features of a member of the MMP family, including a prodomain with the activation locus, the zinc binding site, and the hemopexin domain.
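The relationship between the stated open reading frame length and the stated protein length can be checked with simple codon arithmetic. The sketch below assumes that the 1689 bp figure includes the terminal stop codon; that assumption is not stated in the text, and the function name `orf_protein_length` is hypothetical.

```python
def orf_protein_length(orf_bp, includes_stop=True):
    """Number of amino acids encoded by an open reading frame of `orf_bp` bases.

    Assumes a standard triplet code. If `includes_stop` is True (an
    assumption made here, not stated in the source), one codon is
    subtracted for the stop codon.
    """
    if orf_bp <= 0 or orf_bp % 3 != 0:
        raise ValueError("ORF length must be a positive multiple of 3")
    codons = orf_bp // 3
    return codons - 1 if includes_stop else codons


# 1689 bp / 3 = 563 codons; minus one stop codon gives 562 residues.
print(orf_protein_length(1689))  # 562
```

Under that assumption the two figures quoted in the paragraph above are mutually consistent.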

The general roles of matrix metalloproteases are described above. The specific interactions of matrix metalloproteinases with other proteins, e.g., furin and progelatinase, are described in Basbaum and Werb (1996) Current Opinion in Cell Biology 8:731-738. Matrix metalloproteases are typically zinc endopeptidases that are required for the degradation of extracellular matrix components during normal embryo development, morphogenesis, and tissue remodeling. Their proteolytic activities are precisely regulated by endogenous tissue inhibitors of metalloproteases (TIMPs). Disruption of this balance results in diseases such as arthritis, atherosclerosis, and tumor growth and metastasis. See Nagase (1996) in Hooper (ed.) Zinc Metalloproteinases in Health and Disease Taylor and Francis, London; Coussens and Werb (1996) Chem. Biol. 3:895-904. Therefore, the F06B09 gene product could play a role in the migration of the dendritic cells (DC) or in the progression of the dendrites between the stromal cells. The way the MMPs act on the matrix is complex. The MMP is typically produced as an inactive proenzyme that needs to be processed by another protease, most probably furin, since F06B09 contains a site of cleavage for this convertase. This process probably occurs intracellularly (Basbaum and Werb (1996) Current Opinion in Cell Biology 8:731-738), and is likely followed by an interaction of F06B09 with other proteases like progelatinase.

A single 3.7 kb mRNA transcript of this gene was found to be mainly expressed in CD34⁺-derived human DC and also weakly in in vitro generated granulocytes. No signal was detected in TF1, CHA, Jurkat, MRC5, or U937 cell lines, nor in freshly isolated monocytes, activated T and B cells, and activated peripheral blood lymphocytes (PBLs). Among normal adult human tissues, this mRNA was detected in spleen, lymph node, thymus, appendix, and bone marrow, but no expression was found in fetal tissues. RT-PCR distribution analysis showed a significant expression of the novel MT-MMP in activated DC and weak expression in the JY B cell line. Interestingly, it was found that the novel MT-MMP mRNA expression was down-regulated upon DC activation with CD40L. The expression pattern of this gene, which is predominantly expressed by DC, together with its putative membrane localization, suggests that it could be involved in the degradation of the extracellular matrix during DC migration.

The descriptions below are directed, for exemplary purposes, to primate embodiments, e.g., human, but are likewise applicable to related embodiments from other, e.g., natural, sources. Other ESTs have been identified from rodent cDNA libraries. These sources should, where appropriate, include various vertebrates, typically warm blooded animals, e.g., birds and mammals, particularly domestic animals, and primates. The sequences exhibit significant similarity to membrane-type matrix metalloproteases MT-MMP1 to 4. Table 2 shows an alignment of the family members.
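Similarity between family members is commonly summarized as a percent identity over an alignment. The sketch below shows one simple way such a figure could be computed for two pre-aligned sequences of equal length; it is an assumption for illustration and is not stated to be the method behind any percentage reported herein. The function name `percent_identity` is hypothetical, and the two test fragments are one-letter transcriptions of the zinc binding regions of the two F06B09 sequences shown in Table 1.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of alignment columns holding identical residues.

    Both inputs must already be aligned and of equal length. Columns
    where both sequences carry a gap are ignored; a gap opposite a
    residue counts as a mismatch. These conventions are assumptions.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == gap and y == gap)
    ]
    if not columns:
        raise ValueError("no comparable columns")
    matches = sum(1 for x, y in columns if x == y and x != gap)
    return 100.0 * matches / len(columns)


# Zinc binding regions transcribed from Table 1 (they differ at one position).
seq2_zinc = "HEFGHALGMGH"
seq4_zinc = "HEFGHALGLGH"
print(round(percent_identity(seq2_zinc, seq4_zinc), 1))  # 90.9
```

Other definitions of identity (for example, dividing by the shorter sequence length) give different numbers, so the convention should be stated alongside any reported value.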

TABLE 1 Human F06B09 nucleotide and predicted amino acid sequence. SEQID NO: 1 and 2. Predicted signal sequence/cleavage is indicated.Predicted extracellular domain about 1-527; transmembrane segment about528-543; cytoplasmic domain about 544-545. CATGCAACAT AATCTTGCTCGATTCTAAAG TCAACGGATC CTGCAAAATT CGCGGCCGCG 60 TCAACCCATT AGGTCTTGGCCTTGGAATAA AATTGCTTCT CGTCTGATTC CCGGGCCCAC 120 CCGACCCAGC GGCGCAACCCTGGCCCTCCG GGACCCTCCG CTGACTCCAC CGCGCACTTC 18O CCGGGACCCC CACACACATCCCAGCCCTCC GGCCGATCCC TCCCTACTCG GTGCCGGGTG 240 CCCCCCTTTT TTTTCTAGGCCCGGATCTCC TCCCCCAGGT CCCCGGGGCG GCCCCAACCA 300 GGCCCCCTTC AAACCCCGCCGGCGGCCCGG GCTGGGGCGC ACC ATG CGG CTG CGG 355                                                Met Arg Leu Arg                                                -18         -15 CTC CGGCTT CTG GCG CTG CTG CTT CTG CAT GCT GGC ACC GCC CGC GCG 403 Leu Arg LeuLeu Ala Leu Leu Leu Leu His Ala Gly Thr Ala Arg Ala                -10                  -5                   1 CGC CCC GAAGCC CTC GGC GCA GGA CTT AGC CTG GGC TGT GAG AAC TGG 451 Arg Pro Glu AlaLeu Gly Ala Gly Leu Ser Leu Gly Cys Glu Asn Trp          5                  10                  15 CTG ACT CGC TAT GGTTAC CTA CCG CCA CCC GAC CCT GCC CAG GCC CAG 499 Leu Thr Arg Tyr Gly TyrLeu Pro Pro Pro Asp Pro Ala Gln Ala Gln     20                  25                  30 CTG CAG AGC CCT GAA AATTTG CGC GAT GCC ATC AAA GTC ATG CAA AGG 547 Leu Gln Ser Pro Glu Asn LeuArg Asp Ala Ile Lys Val Met Gln Arg 35                  40                  45                  50 TTC GCGGGG CTG CCG GAG ACC GGC CGC ATG GAC CCA GGG ACA GTG GCC 595 Phe Ala GlyLeu Pro Glu Thr Gly Arg Met Asp Pro Gly Thr Val Ala                 55                  60                  65 ACC ATG CGTAAG CCC CGC TGC TCC CTG CCT GAC GTG CTG GGG GTG GCG 643 Thr Met Arg LysPro Arg Cys Ser Leu Pro Asp Val Leu Gly Val Ala             70                  75                  80 GGG CTG GTC AGGCGG CGT CGC CGG TAC GGT CTG AGC GGC AGC GTG TGG 691 Gly Leu Val Arg ArgArg Arg Arg 
Tyr Gly Leu Ser Gly Ser Val Trp         85                  90                  95 GAG AAG CGA ACC GTGACA TGG AGG GTA CGT TCC TTC CCC CAG AGC TCC 739 Glu Lys Arg Thr Val ThrTrp Arg Val Arg Ser Phe Pro Gln Ser Ser    100                 105                 110 CAG GTG AGC CAG GAG ACCGTG CGG GTC CTC GTG AGC TAT GCC CTG ATG 787 Gln Val Ser Gln Glu Thr ValArg Val Leu Val Ser Tyr Ala Leu Met115                 120                 125                 130 GCG TGGGGC ATG GAG TCA GGC CTC ACA TTT CAT GAG GTG GAT TCC CCC 835 Ala Trp GlyMet Glu Ser Gly Leu Thr Phe His Glu Val Asp Ser Pro                135                 140                 145 CAG GGC CAGGAG CCC GAC ATC CTC ATA GAC TTT GCC CGC GCC TTC CAA 883 Gln Gly Gln GluPro Asp Ile Leu Ile Asp Phe Ala Arg Ala Phe Gln            150                 155                 160 CAG GAC AGC TACCCC TTC GAC GGG TTG GGG GGC ACC CTA GCC CAT GCC 931 Gln Asp Ser Tyr ProPhe Asp Gly Leu Gly Gly Thr Leu Ala His Ala        165                 170                 175 TTC TTC CCT GGG GAGCAC CCC ATC TCC GGG GAC ACT CAC TTT GAC GAT 979 Phe Phe Pro Gly Glu HisPro Ile Ser Gly Asp Thr His Phe Asp Asp    180                 185                 190 GAG GAG ACC TGG ACT TTTGGG TCA AAA GAC GGC GAG GGG ACC GAC CTG 1027 Glu Glu Thr Trp Thr Phe GlySer Lys Asp Gly Glu Gly Thr Asp Leu195                 200                 205                 210 TTT GCCGTG GCT GTC CAT GAG TTT GGC CAC GCC CTG GGC ATG GGC CAC 1075 Phe Ala ValAla Val His Glu Phe Gly His Ala Leu Gly Met Gly His                215                 220                 225 TCC TCA GCCCCC GAC TCC ATT ATG AGG CCC TTC TAC CAG GGT CCG GTG 1123 Ser Ser Ala ProAsp Ser Ile Met Arg Pro Phe Tyr Gln Gly Pro Val            230                 235                 240 GGC GAC CCT GACAAG TAC CGC CTG TCT CTG GAT GAC CGC GAT GGC CTG 1171 Gly Asp Pro Asp LysTyr Arg Leu Ser Leu Asp Asp Arg Asp Gly Leu        245                 250                 255 CAG CAA CTC TAT GGGAAG GCG CCC CAA ACC CCA TAT GAC AAG CCC ACA 
1219 Gln Gln Leu Tyr Gly LysAla Pro Gln Thr Pro Tyr Asp Lys Pro Thr    260                 265                 270 AGG AAA CCC CTG GCT CCTCCG CCC CAG CCC CCG GCC TCG CCC ACA CAC 1267 Arg Lys Pro Leu Ala Pro ProPro Gln Pro Pro Ala Ser Pro Thr His275                 280                 285                 290 AGC CCATCC TTC CCC ATC CCT GAT CGA TGT GAG GGC AAT TTT GAC GCC 1315 Ser Pro SerPhe Pro Ile Pro Asp Arg Cys Glu Gly Asn Phe Asp Ala                295                 300                 305 ATC GCC AACATC CGA GGG GAA ACT TTC TTC TTC AAA GGC CCC TGG TTC 1363 Ile Ala Asn IleArg Gly Glu Thr Phe Phe Phe Lys Gly Pro Trp Phe            310                 315                 320 TGG CGC CTC CAGCCC TCC GGA CAG CTG GTG TCC CCG CGA CCC GCA CGG 1411 Trp Arg Leu Gln ProSer Gly Gln Leu Val Ser Pro Arg Pro Ala Arg        325                 330                 335 CTG CAC CGC TTC TGGGAG GGG CTG CCC GCC CAG GTG AGG GTG GTG CAG 1459 Leu His Arg Phe Trp GluGly Leu Pro Ala Gln Val Arg Val Val Gln    340                 345                 350 GCC GCC TAT GCT CGG CACCGA GAC GGC CGA ATC CTC CTC TTT AGC GGG 1507 Ala Ala Tyr Ala Arg His ArgAsp Gly Arg Ile Leu Leu Phe Ser Gly355                 360                 365                 370 CCC CAGTTC TGG GTG TTC CAG GAC CGG CAG CTG GAG GGC GGG GCG CGG 1555 Pro Gln PheTrp Val Phe Gln Asp Arg Gln Leu Glu Gly Gly Ala Arg                375                 380                 385 CCG CTC ACGGAG CTG GGG CTG CCC CCG GGA GAG GAG GTG GAC GCC GTG 1603 Pro Leu Thr GluLeu Gly Leu Pro Pro Gly Glu Glu Val Asp Ala Val            390                 395                 400 TTC TCG TGG CCACAG AAC GGG AAG ACC TAC CTG GTC CGC GGC CGG CAG 1651 Phe Ser Trp Pro GlnAsn Gly Lys Thr Tyr Leu Val Arg Gly Arg Gln        405                 410                 415 TAC TGG CGC TAC GACGAG GCG GCG GCG CGC CCG GAC CCC GGC TAC CTT 1699 Tyr Trp Arg Tyr Asp GluAla Ala Ala Arg Pro Asp Pro Gly Tyr Leu    420                 425                 430 CGC GAC CTG AGC CTC 
TGGGAA GGC GCG CCC CCC TCC CCT GAC GAT GTC 1747 Arg Asp Leu Ser Leu Trp GluGly Ala Pro Pro Ser Pro Asp Asp Val435                 440                 445                 450 ACC GTCAGC AAC GCA GGT GAC ACC TAC TTC TTC AAG GGC GCC CAC TAC 1795 Thr Val SerAsn Ala Gly Asp Thr Tyr Phe Phe Lys Gly Ala His Tyr                455                 460                 465 TGG CGC TTCCCC AAG AAC AGC ATC AAG ACC GAG CCG GAC GCC CCC CAG 1843 Trp Arg Phe ProLys Asn Ser Ile Lys Thr Glu Pro Asp Ala Pro Gln            470                 475                 480 CCC ATG GGG CCCAAC TGG CTG GAC TGC CCC GCC CCG AGC TCT GGT CCC 1891 Pro Met Gly Pro AsnTrp Leu Asp Cys Pro Ala Pro Ser Ser Gly Pro        485                 490                 495 CGC GCC CCC AGG CCCCCC AAA GGG ACC CCC GTG TCC GAA ACC TGC GAT 1939 Arg Ala Pro Arg Pro ProLys Gly Thr Pro Val Ser Glu Thr Cys Asp    500                 505                 510 TGT CAG TGC GAG CTC AACCAG GCC GCA GGA CGT TGG CCT GCT CCC ATC 1987 Cys Gln Cys Glu Leu Asn GlnAla Ala Gly Arg Trp Pro Ala Pro Ile515                 520                 525                 530 CCG CTGCTC CTC TTG CCC CTG CTG GTG GGG GGT GTA GCC TCC CGC 2032 Pro Leu Leu LeuLeu Pro Leu Leu Val Gly Gly Val Ala Ser Arg                535                 540                 545 TGATGGGGGGAGCCATCCAG ACCGAACAGC GCCCTCCACG GCCGAGTCCC CCGCCGCTGG 2092 ACCTGGTCGGGGGTTGTGAG GCGCTGCGGA GGCCCCTTGT CTGTTCCCAC GGACGGGGGC 2152 TCGGGCGCGGACTAAGCAGG GGGGATCTCC CGCGCAGGGG CGGCGGCGGC GGGGACCGGT 2212 CGCCTGGCGCTGGGCTCAGT CTCCTCAGGG TCTGAGACCC CGGCGCTGCC ACCGGAACCC 2272 GCCTTCAGGGGCGCACGCGC GCTGGGACCA TGCGTCGGTC GTCGCCCCCG TCGTTCCCTC 2332 CCGGCTGCCGCCAGGGGGCG GTCGGACCCC GCCTCCCGAG CCCGGGGAGG GGCGGGGAGG 2392 ACAAGGGGCGGGCCCGCGGC CTCACCCGGA GGGACGGCAG CCCCGGTCGC GCGCTGGCCC 2452 CGCAGGACCTTCCTTTTCCA GGAAGAGCCA GCTTTTCTCG GAGCGCAGTC CTGGGACTCT 2512 CCGCAGCCCCGCCCCGCCTG GCCACTGCGT CTGGCATTCC TGGGTCGTTA GAGGACAGGC 2572 CTGACTGCGAAGCTGTGCCT TGCCCCTCTC CCACCCGCAG TTTCTCACCC CGTTCTGCTC 2632 
CCACAAGGCCCCCCTACAGT CACTGCCACA CTGGTGGGGA CCTGGGACCC AGACCCGGAA 2692 CCAGCCCAGATATCACCCCT GAGGACCCAT GCGCCACGTC CTGGGTGGTG GAATCAGTGG 2752 GTGGAGGGACGACCCTTGCT CTCCAGGCTG TTAACCTTTT CCGTTGCTCC CCCGCCACCC 2812 ACCTCCTCCTCCCCAGGCCA CCCAACTTGG GCACCTCCCT GGGCCCAGAA CTGCCTTCCA 2872 TTCAATGGGGAACCCTTCTA TCCCCAAGAA CCCCTTCCCT GCTTGCACCC TGGAGAGAAC 2932 AGCTTGACTCCCATCAACTC AACGCTGGTG GAAAGACAGG GACCGAACCC TGGCTCAGGC 2992 CTGGTCATTGCCTCCTCAGC ACTCCCTCCT GGGAGGCCTT AGCTCTAGAG TGAGGGGTGG 3052 GTGGAACCTGGGGGCACCTC GTTCACCCTG TCCCCACTCC CCACAGTTTT AGGATCTAAA 3112 TGATTGCCTCTGGAACTATT CTTCTAGACT ATCCCACATC AGAATCACTG GGAAATTTAA 3172 GTTTGCAGATCCCACACTCA CCCTGAATCC TCACTCAGGG TGGGGTCAGG AATCTGCATT 3232 TTAACTAGTCGCGGGGATTG TGGGGGGCAG TAGCTGGCTG TTTCGTGGCA TTTCTGTGGC 3292 TCTGCAGTGTTCCTCCACCC CAGGACCAAT ATGTTCAGGC CACACCGATG GCCTGAACCC 3352 CATGGGTAGAGTCACTTAGG GGCCACTTCC TAAGTTGCTG TCCAGCCTCA GTGACCCCCT 3412 AGTGCTTCCTGGAGCTGAGG CTGTGGGCGG CTGTCCCAGC AACCAAGCGA GGGGTTGCCC 3472 CAGTTGCTCATACAAACAGA TCAGCATGAG GACAGAAGGC AGGAGACTTT GGTCAGTTAC 3532 CTGGGAATTCTGGGCTGCCA GGAAACGATT TGGGCCTCTG TCAGTTTCTT TTCCATGTAT 3592 GAGGAGGGGGAAATTTGTAT ATTAGATACT TATTCATCCC ACTCTGGACA ATAAAAACGA 3652 ATGTACAAAAAAAACATAAA AAAAAAAAAT AAAGAAAATC AAA 3695 Alternative sequence of F06B09provides (SEQ ID NO: 3 and 4). Notable motifs include: predicted signalsequence, as shown; propeptide domain from about 1-66; C switch motiffrom about 67-73; furin site from about 82-86; catalytic site from about87-211; zinc binding site from about 212-222, with notable His at 212,216, and 222; hinge region from about 260-290; hemopexin-like domainfrom about 291-525; transmembrane segment from about 526-538; andcytoplasmic tail from about 539-542. 
CATGCAACAT AATCTTGCTC GATTCTAAAGTCAACGGATC CTGCAAAATT CGCGGCCGCG 60 TCAACCCATT AGGTCTTGGC CTTGGAATAAAATTGCTTCT CGTCTGATTC CCGGGCCCAC 120 CCGACCCAGC GGCGCAACCC TGGCCCTCCGGGACCCTCCG CTGACTCCAC CGCGCACTTC 180 CCGGGACCCC CACACACATC CCAGCCCTCCGGCCGATCCC TCCCTACTCG GTGCCGGGTG 240 CCCCCCGCCC TCTCCAGGCC CGGATCTCCTCCCCCAGGTC CCCGGGGCGG CCCCAGCCAG 300 GCCCCCTTCG AACCCCGCCG GCGGCCCGGGCTGGGGCGCA CC ATG CGG CTG CGG 355                                               Met Arg Leu Arg                                               -18         -15 CTC CGGCTT CTG GCG CTG CTG CTT CTG CTG CTG GCA CCG CCC GCG CGC 402 Leu Arg LeuLeu Ala Leu Leu Leu Leu Leu Leu Ala Pro Pro Ala Arg        -15                 -10                  -5 GCC CCG AAG CCC TCGGCG CAG GAC GTG AGC CTG GGC GTG GAC TGG CTG 450 Ala Pro Lys Pro Ser AlaGln Asp Val Ser Leu Gly Val Asp Trp Leu      1               5                  10                  15 ACT CGCTAT GGT TAC CTG CCG CCA CCC CAC CCT GCC CAG GCC CAG CTG 498 Thr Arg TyrGly Tyr Leu Pro Pro Pro His Pro Ala Gln Ala Gln Leu                 20                  25                  30 CAG AGC CCTGAG AAG TTG CGC GAT GCC ATC AAA GTC ATG CAG AGG TTC 546 Gln Ser Pro GluLys Leu Arg Asp Ala Ile Lys Val Met Gln Arg Phe             35                  40                  45 GCG GGG CTG CCGGAG ACC GGC CGC ATG GAC CCA GGG ACA GTG GCC ACC 594 Ala Gly Leu Pro GluThr Gly Arg Met Asp Pro Gly Thr Val Ala Thr         50                  55                  60 ATG CGT AAG CCC CGCTGC TCC CTG CCT GAC GTG CTG GGG GTG GCG GGG 642 Met Arg Lys Pro Arg CysSer Leu Pro Asp Val Leu Gly Val Ala Gly     65                  70                  75 CTG GTC AGG CGG CGT CGCCGG TAC GCT CTG AGC GGC AGC GTG TGG AAG 690 Leu Val Arg Arg Arg Arg ArgTyr Ala Leu Ser Gly Ser Val Trp Lys 80                  85                  90                  95 AAG CGAACC CTG ACA TGG AGG GTA CGT TCC TTC CCC CAG AGC TCC CAG 738 Lys Arg ThrLeu Thr Trp Arg Val Arg Ser Phe Pro Gln Ser Ser Gln                100                 
105                 110 CTG AGC CAGGAG ACC GTG CGG GTC CTC ATG AGC TAT GCC CTG ATG GCC 786 Leu Ser Gln GluThr Val Arg Val Leu Met Ser Tyr Ala Leu Met Ala            115                 120                 125 TGG GGC ATG GAGTCA GGC CTC ACA TTT CAT GAG GTG GAT TCC CCC CAG 834 Trp Gly Met Glu SerGly Leu Thr Phe His Glu Val Asp Ser Pro Gln        130                 135                 140 GGC CAG GAG CCC GACATC CTC ATC GAC TTT GCC CGC GCC TTC CAC CAG 882 Gly Gln Glu Pro Asp IleLeu Ile Asp Phe Ala Arg Ala Phe His Gln    145                 150                 155 GAC AGC TAC CCC TTC GACGGG TTG GGG GGC ACC CTA GCC CAT GCC TTC 930 Asp Ser Tyr Pro Phe Asp GlyLeu Gly Gly Thr Leu Ala His Ala Phe160                 165                 170                 175 TTC CCTGGG GAG CAC CCC ATC TCC GGG GAC ACT CAC TTT GAC GAT GAG 978 Phe Pro GlyGlu His Pro Ile Ser Gly Asp Thr His Phe Asp Asp Glu                180                 185                 190 GAG ACC TGGACT TTT GGG TCA AAA GAC GGC GAG GGG ACC GAC CTG TTT 1026 Glu Thr Trp ThrPhe Gly Ser Lys Asp Gly Glu Gly Thr Asp Leu Phe            195                 200                 205 GCC GTG GCT GTCCAT GAG TTT GGC CAC GCC CTG GGC CTG GGC CAC TCC 1074 Ala Val Ala Val HisGlu Phe Gly His Ala Leu Gly Leu Gly His Ser        210                 215                 220 TCA GCC CCC AAC TCCATT ATG AGG CCC TTC TAC CAG GGT CCG GTG GGC 1122 Ser Ala Pro Asn Ser IleMet Arg Pro Phe Tyr Gln Gly Pro Val Gly    225                 230                 235 GAC CCT GAC AAG TAC CGCCTG TCT CAG GAT GAC CGC GAT GGC CTG CAG 1170 Asp Pro Asp Lys Tyr Arg LeuSer Gln Asp Asp Arg Asp Gly Leu Gln240                 245                 250                 255 CAA CTCTAT GGG AAG GCG CCC CAA ACC CCA TAT GAC AAG CCC ACA AGG 1218 Gln Leu TyrGly Lys Ala Pro Gln Thr Pro Tyr Asp Lys Pro Thr Arg                260                 265                 270 AAA CCC CTGGCT CCT CCG CCC CAG CCC CCG GCC TCG CCC ACA CAC AGC 1266 Lys Pro Leu AlaPro Pro Pro Gln Pro Pro Ala Ser Pro 
Thr His Ser            275                 280                 285 CCA TCC TTC CCCATC CCT GAT CGA TGT GAG GGC AAT TTT GAC GCC ATC 1314 Pro Ser Phe Pro IlePro Asp Arg Cys Glu Gly Asn Phe Asp Ala Ile         290                 295                 300 GCC AAC ATC CGA GGGGAA ACT TTC TTC TTC AAA GGC CCC TGG TTC TGG 1362 Ala Asn Ile Arg Gly GluThr Phe Phe Phe Lys Gly Pro Trp Phe Trp    305                 310                 315 CGC CTC CAG CCC TCC GGACAG CTG GTG TCC CCG CGA CCC GCA CGG CTG 1410 Arg Leu Gln Pro Ser Gly GlnLeu Val Ser Pro Arg Pro Ala Arg Leu320                 325                 330                 335 CAC CGCTTC TGG GAG GGG CTG CCC GCC CAG GTG AGG GTG GTG CAG GCC 1458 His Arg PheTrp Glu Gly Leu Pro Ala Gln Val Arg Val Val Gln Ala                340                 345                 350 GCC TAT GCTCGG CAC CGA GAC GGC CGA ATC CTC CTC TTT AGC GGG CCC 1506 Ala Tyr Ala ArgHis Arg Asp Gly Arg Ile Leu Leu Phe Ser Gly Pro            355                 360                 365 CAG TTC TGG GTGTTC CAG GAC CGG CAG CTG GAG GGC GGG GCG CGG CCG 1554 Gln Phe Trp Val PheGln Asp Arg Gln Leu Glu Gly Gly Ala Arg Pro        370                 375                 380 CTC ACG GAG CTG GGGCTG CCC CCG GGA GAG GAG GTG GAC GCC GTG TTC 1602 Leu Thr Glu Leu Gly LeuPro Pro Gly Glu Glu Val Asp Ala Val Phe    385                 390                 395 TCG TGG CCA CAG AAC GGGAAG ACC TAC CTG GTC CGC GGC CGG CAG TAC 1650 Ser Trp Pro Gln Asn Gly LysThr Tyr Leu Val Arg Gly Arg Gln Tyr400                 405                 410                 415 TGG CGCTAC GAC GAG GCG GCG GCG CGC CCG GAC CCC GGC TAC CCT CGC 1698 Trp Arg TyrAsp Glu Ala Ala Ala Arg Pro Asp Pro Gly Tyr Pro Arg                420                 425                 430 GAC CTG AGCCTC TGG GAA GGC GCG CCC CCC TCC CCT GAC GAT GTC ACC 1746 Asp Leu Ser LeuTrp Glu Gly Ala Pro Pro Ser Pro Asp Asp Val Thr            435                 440                 445 GTC AGC AAC GCAGGT GAC ACC TAC TTC TTC AAG GGC GCC CAC TAC TGG 1794 Val 
Ser Asn Ala GlyAsp Thr Tyr Phe Phe Lys Gly Ala His Tyr Trp        450                 455                 460 CGC TTC CCC AAG AACAGC ATC AAG ACC GAG CCG GAC GCC CCC CAG CCC 1842 Arg Phe Pro Lys Asn SerIle Lys Thr Glu Pro Asp Ala Pro Gln Pro    465                 470                 475 ATG GGG CCC AAC TGG CTGGAC TGC CCC GCC CCG AGC TCT GGT CCC CGC 1890 Met Gly Pro Asn Trp Leu AspCys Pro Ala Pro Ser Ser Gly Pro Arg480                 485                 490                 495 GCC CCCAGG CCC CCC AAA GCG ACC CCC GTG TCC GAA ACC TGC GAT TGT 1938 Ala Pro ArgPro Pro Lys Ala Thr Pro Val Ser Glu Thr Cys Asp Cys                500                 505                 510 CAG TGC GAGCTC AAC CAG GCC GCA GGA CGT TGG CCT GCT CCC ATC CCG 1986 Gln Cys Glu LeuAsn Gln Ala Ala Gly Arg Trp Pro Ala Pro Ile Pro            515                 520                  525 CTG CTC CTC TTGCCC CTG CTG GTG GGG GGT GTA GCC TCC CGC 2028 Leu Leu Leu Leu Pro Leu LeuVal Gly Gly Val Ala Ser Arg        530                 535                 540 TGATGGGGGGAGCCATCCAG ACCGAACAGC GCCCTCCACG GCCGAGTCCC CCGCCGCTGG 2088 ACCTGGTCGGGGGTTGTGAG GCGCTGCGGA GGCCCCTTGT CTGTTCCCAC GGACGGGGGC 2148 TCGGGCGCGGACTAAGCAGG GGGGATCTCC CGCGCAGGGG CGGCGGCGGC GGGGACCGGT 2208 CGCCTGGCGCTGGGCTCAGT CTCCTCAGGG TCTGAGACCC CGGCGCTGCC ACCGGAACCC 2268 GCCTTCAGGGGCGCACGCGC GCTGGGACCA TGCGTCGGTC GTCGCCCCCG TCGTTCCCTC 2328 CCGGCTGCCGCCAGGGGGCG GTCGGACCCC GCCTCCCGAG CCCGGGGAGG GGCGGGGAGG 2388 ACAAGGGGCGGGCCCGCGGC CTCACCCGGA GGGACGGCAG CCCCGGTCGC GCGCTGGCCC 2448 CGCAGGACCTTCCTTTTCCA GGAAGAGCCA GCTTTTCTCG GAGCGCAGTC CTGGGACTCT 2508 CCGCAGCCCCGCCCCGCCTG GCCACTGCGT CTGGCATTCC TGGGTCGTTA GAGGACAGGC 2568 CTGACTGCGAAGCTGTGCCT TGCCCCTCTC CCACCCGCAG TTTCTCACCC CGTTCTGCTC 2628 CCACAAGGCCCCCCTACAGT CACTGCCACA CTGGTGGGGA CCTGGGACCC AGACCCGGAA 2688 CCAGCCCAGATATCACCCCT GAGGACCCAT GCGCCACGTC CTGGGTGGTG GAATCAGTGG 2748 CTGGAGGGACGACCCTTGCT CTCCAGGCTG TTAACCTTTT CCGTTGCTCC CCCGqCACCC 2808 ACCTCCTCCTCCCCAGGCCA CCCAACTTGG GCACCTCCCT GGGCCCAGAA CTGCCTTCCA 
2868 TTCAATGGGGAACCCTTCTA TCCCCAAGAA CCCCTTCCCT GCTTGCACCC TGGAGAGAAC 2928 AGCTTGACTCCCATCAACTC AACGCTGGTG GAAAGACAGG GACCGAACCC TGGCTCAGGC 2988 CTGGTCATTGCCTCCTCAGC ACTCCCTCCT GGGAGGCCTT AGCTCTAGAG TGAGGGGTGG 3048 GTGGAACCTGGGGGCACCTC GTTCACCCTG TCCCCACTCC CCACAGTTTT AGGATCTAAA 3108 TGATTGCCTCTGGAACTATT CTTCTAGACT ATCCCACATC AGAATCACTG GGAAATTTAA 3168 GTTTGCAGATCCCACACTCA CCCTGAATCC TCACTCAGGG TGGGGTCAGG AATCTGCATT 3228 TTAACTAGTCGCGGGGATTG TGGGGGGCAG TAGCTGGCTG TTTCGTGGCA TTTCTGTGGC 3288 TCTGCAGTGTTCCTCCACCC CAGGACCAAT ATGTTCAGGC CACACCGATG GCCTGAACCC 3348 CATGGGTAGAGTCACTTAGG GGCCACTTCC TAAGTTGCTG TCCAGCCTCA GTGACCCCCT 3408 AGTGCTTCCTGGAGCTGAGG CTGTGGGCGG CTGTCCCAGC AACCACGCGA GGGGTTGCCC 3468 CAGTTGCTCATACAAACAGA TCAGCATGAG GACAGAAGGC AGGAGACTTT GGTCAGTTAC 3528 CTGGGAATTCTGGGCTGCCA GGAAACGATT TGGGCCTCTG TCAGTTTCTT TTCCATGTAT 3588 GAGGAGGGGGAAATTTGTAT ATTAGATACT TATTCATCCC ACTCTGGACA ATAAAAACGA 3648 ATGTACAAAAAAAACATAAA AAAAAAAAAT AAAGAAAATC AAA 3691 MRLRLRLLAL LLLLLAPPARAPKPSAQDVS LGVDWLTRYG YLPPPHPAQA QLQSPEKLRD AIKVMQRFAG LPETGRMDPGTVATMRKPRC SLPDVLGVAG LVRRRRRYAL SGSVWKKRTL TWRVRSFPQS SQLSQETVRVLMSYALMAWG MESGLTFHEV DSPQGQEPDI LIDFARAFHQ DSYPFDGLGG TLAHAFFPGEHPISGDTHFD DEETWTFGSK DGEGTDLFAV AVHEFGHALG LGHSSAPNSI MRPFYQGPVGDPDKYRLSQD DRDGLQQLYG KAPQTPYDKP TRKPLAPPPQ PPASPTHSPS FPIPDRCEGNFDAIANIRGE TFFFKGPWFW RLQPSGQLVS PRPARLHRFW EGLPAQVRVV QAAYARHRDGRILLFSGPQF WVFQDRQLEG GARPLTELGL PPGEEVDAVF SWPQNGKTYL VRGRQYWRYDEAAARPDPGY PRDLSLWEGA PPSPDDVTVS NAGDTYFFKG AHYWRFPKNS IKTEPDAPQPMGPNWLDCPA PSSGPRAPRP PKATPVSETC DCQCELNQAA GRWPAPIPLL LLPLLVGGVA SR

TABLE 2 Comparison of various MMPs witb F06B09. MT4-MMP (SEQ ID NO: 7)is from Genbank 3466295 (EMBL X89576); MT2-MMP (SEQ ID NO: 8) Z48482):MT1-MMP (SEQ ID NO: 9) is from Genbank 804994 (EMBL X83535; see alsoGenbank 1495995 (EMBL X90925) and 793763 (DDBJ D26512); and MT3-MMP (SEQID NO: 10) is from Genbank 2424979 (DDBJ D85511). MT4-MMP 1 0 F06B09 1           MRLRLRLLALLLLLLAPPARAPKPSAQDVSLGVDWLTRY 39 MT2-MMP 1 0MT1-MMP 1         MSPAPRPSRCLLLPLLTLGTALASLGSAQSSSFSPEAWLQQY 42 MT3-MMP1 MILLTFSTGRRLDFVHHSGVFFLQTLLWILCATVCGTEQYFNVEVWLQKY 50 MT4-MMP 1                         MQQFGGLEATGILDEATLALMKTPR 25 F06B09 40GYLPPPHPAQAQLQSPEKLRDAIKVMQRFAGLPETGRMDPGTVATMRKPR 89 MT2-MMP 1                                             MKRPR 5 MT1-MMP 43GYLPPGDLRTHTQRSPQSLSAAIAAMQKFYGLQVTGKADADTMKAMRRPR 92 MT3-MMP 51GYLPPTDPRMSVLRSAETMQSALAAMQQFYGINMTGKVDRNTIDWMKKPR 100                                             *. ** MT4-MMP 26CSLPDLP-VLTQARRR--RQ--APAPTKWNKRNLSWRVRTFPRDSPLGHD 70 F06B09 90CSLPDVL-GVAGLVRR--RRRYALSGSVKKKRTLTWRVRSFPQSSQLSQE 136 MT2-MMP 6CGVPDQFGVRVKANLRRRRKRYALTGRKWNNHHLTFSIQNYT--EKLGWY 53 MT1-MMP 93CGVPDKFGAEIKANVR--RKRYAIQGLKWQHNEITFCIQNYT--PKVGEY 138 MT3-MMP 101CGVPDQTRGSSKFHIR--RKRYALTGQKWQHKHITYSIKNVT--PKVGDP 146* .**          *  *.  *     *    ..  ..       . MT4-MMP 71TVRALMYYALKVWSDIAPLNFHEVA---GS-----TADIQIDFSKADHND 112 F06B09 137TVRVLMSYALMAWGMESGLTFHEVDSPQGQ-----EPDILIDFARAFHQD 181 MT2-MMP 54HSMEAVRRAFRVWEQATPLVFQEVPYEDIRLRRQKEADIMVLFASGFHGD 103 MT1-MMP 139ATYEAIRKAFRVWESATPLRFREVPYAYIREGHEKQADIMIFFAEGFHGD 188 MT3-MMP 147ETRKAIRRAFDVWQNVTPLTFEEVPYSELENGK-RDVDITIIFASGFHGD 195     .  *   *   . * * **             ** . *.   
* * MT4-MMP 113GYPFDGPGGTVAHAFFPGHHHTAGDTHFDDDEAWTFRSSDAHGMDLFAVA 162 F06B09 182SYPFDGLGGTLAHAFFPGEHPISGDTHFDDEETWTFGSKDGEGTDLFAVA 231 MT2-MMP 104SSPFDGTGGFLAHAYFPGPG-LGGDTHFDADEPWTFSSTDLHGNNLFLVA 152 MT1-MMP 189STPFDGEGGFLAHAYFPGPN-IGGDTHFDSAEPWTVRNEDLNGNDIFLVA 237 MT3-MMP 196SSPFDGEGGFLAHAYFPGPG-IGGDTHFDSDEPWTLGNPNHDGNDLFLVA 244  **** ** .***.***     ******  * **       *  .* ** MT4-MMP 163VHEFGHAIGLSHVAAAHSIMRPYYQGPVGDPLRYGLPYEDKVRVWQLYGV 212 F06B09 232VHEFGHALGLGHSSAPNSIMRPFYQGPVGDPDKYRLSQDDRDGLQQLYG- 280 MT2-MMP 153VHELGHALGLEHSSNPNAIMAPFYQWKDVDN--FKLPEDDLRGIQQLYGT 200 MT1-MMP 238VHELGHALGLEHSSDPSAIMAPFYQWMDTEN--FVLPDDDRRGIQQLYGG 285 MT3-MMP 245VHELGHALGLEHSNDPTAIMAPFYQYMETDN--FKLPNDDLQGIQKIYGP 292*** ***.** *     .** *.**    .   . *  .*   . ..** MT4-MMP 213RESVSPTAQ--PEEPPLLP----------EP------------PDNRSSA 238 F06P09 281KAPQTPYDK--PTRKPLAP----------PPQ-----------PPASPTH 307 MT2-MMP 201PDGQPQPTQPLPTVTPRRPG-----RPDHRPPRPPQPPPPGGKPERPPKP 245 MT1-MMP 286ESG-------FPTKMPPQP------RTTSRP----------SVPDKPKNP 312 MT3-MMP 293PDKIPPPTRPLPTVPPHRSIPPADPRKNDRP-----------KPPRPPTG 331           *   *              *            * MT4-MMP 239PPR----------KDVPHRCSTHFDAVAQIRGEAFFFKGKYFWRLTRDRH 278 F06B09 308SPS----------FPIPDRCEGNFDAIANIRGETFFFKGPWFWRLQPSGQ 347 MT2-MMP 246GPPVQPRATERPDQYGPNICDGDFDTVAMLRGEMFVFKGRWFWRVRHNR- 294 MT1-MMP 313-------------TYGPNICDGNFDTVAMLRGEMFVFKERWFWRVRNNQ- 348 MT3-MMP 332RPS--------YPGAKPNICDGNFNTLAILRREMFVFKDQWFWRVRNNR- 372                *  *   * ..* .* * * **   ***. MT4-MMP 279LVSLQPAQMHRFWRGLPLHLDSVDAVYERTSDHKIVFFKGDRYWVFKDNN 328 F06B09 348LVSPRPARLHRFWEGLPAQVRVVQAAYARHRDGRILLFSGPQFWVFQDR- 396 MT2-MMP 295VLDNYPMPIGHFWRGLPGDIS---AAYERQ-DGRFVFFKGDRYWLFREAN 340 MT1-MMP 349VMDGYPMPIGQFWRGLPASIN---TAYERK-DGKFVFFKGDKHWVFDEAS 394 MT3-MMP 373VMDGYPMQITYFWRGLPPSID---AVYENS-DGNFVFFKGNKYWVFKDTT 418..   *  .  ** ***  .    . *    *   . * * . *.* . 
MT4-MMP 329VEEGYPRPVSDFSLP--PGG-IDAAFSWAHNDRTYFFKDQLYWRYDDHTR 375 F06B09 397QLEGGARPLTELGLP--PGEEVDAVFSWPQNGKTYLVRGRQYWRYDEAAA 444 MT2-MMP 341LEPGYPQPLTSYGL-GIPYDRIDTAIWWEPTGHTFFFQEDRYWRFNEETQ 389 MT1-MMP 395LEPGYPKHIKELGR-GLPTDKIDAALFWMPNGKTYFFRGNKYYRFNEELR 443 MT3-MMP 419LQPGYPHDLITLGS-GIPPHGIDSAIWWEDVGKTYFFKGDRYWRYSEEMK 467   *  . .        *   .*.   *    .*.  .   * *. . MT4-MMP 376HMDPGYPAQSPLWRGVPSTLDDAMRWS-DGASYFFRGQEYWKVLDGELEV 424 F06B09 445RPDPGYPRDLSLWEGAPPSPDDVTVSN-AGDTYFFKGAHYWRFPKNSIKT 493 MT2-MMP 390RGDPGYPKPISVWQGIPASPKGAFLSNDAAYTYFYKGTKYWKFDNERLRM 439 MT1-MMP 444AVDSEYPKNIKVWEGIPESPRGSFMGSDEVFTYFYKGNKYWKFNNQKLKV 493 MT3-MMP 468TMDPGYPKPITVWKGIPESPQGAFVHKENGFTYFYKGKEYWKFNNQILKV 517  *  **    .* * * .            .**..*  **.     . MT4-MMP 425APGYPQSTARDWLVCGDSQADGSVAAGV-------DAAEGPRAPPGQHDQ 467 F06B09 494EPDAPQPMGPNWLDCPAP-------------------SSGPRAP----RP 520 MT2-MMP 440EPGYPKSILRDFMGCQEHVEPGPRWPDVARPPFNPHGGAEPGADSAEGDV 489 MT1-MMP 494EPGYPKSALRDWMGCPSGGRPDE--------------GTEEETEVIIIEV 529 MT3-MMP 518EPGYPRSILKDFMGCDGPTDRVKEG------------HSPPDDVDIVIKL 555  *  *.      . *MT4-MMP 468 SRS--------------------EDGYEVCSCTSGASSPPGAPGPLVAAT 497F06B09 521 PKA--------------------TPVSETCDCQCELN---QAAGRWPAPI 547MT2-MMP 490 GDGDGDFGAGVNKDRGSRVVVQMEEVARTVNVVMVLVPLLLLLCVLGLTY 539MT1-MMP 530 D----------------------EEGGGAVSAAAVVLPVLLLLLVLAVGL 557MT3-MMP 556 DN-----------------------TASTVKAIAIVIPCILALCLLVLVY 582MT4-MMP 498 MLLLLP-PLSPGALWTAAQALTL 519 F06B09 548 PLLLLP-LLVGGVASR 562MT2-MMP 540 ALVQMQRKGAPRVLLYCKRSLQEWV 564 MT1-MMP 558AVFFFRRHGTPRRLLYCQRSLLDKV 582 MT3-MMP 583 TVFQFKRKGTPRHILYCKRSMQEWV 607
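The consensus line under each block of Table 2 can be illustrated with a short sketch. The one below marks only fully identical columns with an asterisk; the rule behind the "." marks is not stated in the text, so it is not reproduced here. The function name `identity_line` is hypothetical, and the five rows are the first twelve columns of the zinc-binding block as they appear in Table 2.

```python
def identity_line(rows):
    """Return '*' under each column where all rows share one residue, else ' '."""
    if len({len(row) for row in rows}) != 1:
        raise ValueError("aligned rows must all have the same length")
    marks = []
    for column in zip(*rows):
        # A column counts as conserved only if it holds a single residue and no gap.
        conserved = len(set(column)) == 1 and column[0] != "-"
        marks.append("*" if conserved else " ")
    return "".join(marks)


# First twelve columns of the zinc-binding block of Table 2.
block = {
    "MT4-MMP": "VHEFGHAIGLSH",
    "F06B09": "VHEFGHALGLGH",
    "MT2-MMP": "VHELGHALGLEH",
    "MT1-MMP": "VHELGHALGLEH",
    "MT3-MMP": "VHELGHALGLEH",
}

line = identity_line(list(block.values()))
print(line)
```

The asterisks fall on the Val-His-Glu, Gly-His-Ala, Gly-Leu, and final His columns, in agreement with the asterisk positions printed under that block of the table.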

In the search for DC specific genes, a novel matrix metalloproteinase (MMP) homologue from the membrane-type matrix metalloproteinase (MT-MMP) subclass was identified. Of interest, this novel gene, designated F06B09, is predominantly expressed by both CD34⁺- and monocyte-derived DC and is down-regulated after DC maturation by CD40L L cells. The deduced protein sequence of F06B09 is clearly a member of the MMP family, characterized by the presence of a prodomain with the activation locus containing the essential cysteine residue, a catalytic domain including the zinc-binding site with the consensus sequence HExGHxxxxxH, and a hemopexin-like domain (Birkedal-Hansen (1995) Curr. Opin. Cell Biol. 7:728-735; Shapiro (1998) Curr. Opin. Cell Biol. 10:602-608). MMPs, which belong to the metzincin (or Clan) superfamily, can be classified into at least four subfamilies of closely related members: collagenases, stromelysins, gelatinases, and MT-MMPs, although some MMPs, like the macrophage metalloelastase (Belaaouaj, et al. (1995) J. Biol. Chem. 270:14568-14575) and stromelysin 3 (Basset, et al. (1990) Nature 348:699-704), do not belong to these subclasses. According to its structural characteristics and its high level of homology with MT4-MMP (Puente, et al. (1996) Cancer Res. 56:944-949), F06B09 represents the fifth member of the MT-MMP subclass. All MT-MMPs present a putative transmembrane domain in the C-terminal portion and a characteristic insertion between the propeptide and the catalytic domain containing the consensus amino acid sequence RxK/RR (Basbaum and Werb (1996) Curr. Opin. Cell Biol. 8:731-738). To activate MT-MMPs, this site is cleaved by furin-type enzymes. Like other MMPs or matrixins, the novel MT-MMP contains noncatalytic domains, in addition to the protease domain, which are likely involved in interactions with substrates or other proteins.

MT-MMPs have the ability to cleave substrates, e.g., other MMPs. Indeed, MT1-, MT2- and MT3-MMP can activate proMMP-2 (progelatinase A) into MMP-2 on the cell surface (Butler, et al. (1997) Eur. J. Biochem. 244:653-657; Knauper, et al. (1996) J. Biol. Chem. 271:17124-17131; Kolkenbrock, et al. (1997) Biol. Chem. 378:71-76; Sato, et al. (1994) Nature 370:61-65; Strongin, et al. (1993) J. Biol. Chem. 268:14033-14039; and Takino, et al. (1995) J. Biol. Chem. 270:23013-23020), which in its active form degrades type IV collagen, the major component of basement membranes (Wilhelm, et al. (1989) J. Biol. Chem. 264:17213-17221). Thus, active MMP-2 (gelatinase A) plays a key role in the invasion of migrating cells into tissues. Several reports demonstrate that overexpression of MMPs, especially MMP-2 and MMP-9, is associated with the invasive behavior of tumor cells (Shapiro (1998) Curr. Opin. Cell Biol. 10:602-608; Stetler-Stevenson, et al. (1993) Annu. Rev. Cell Biol. 9:541-573). For MT4-MMP, the gene most homologous to F06B09, it is uncertain whether this MT-MMP can activate proMMP-2. Phylogenetic analysis shows that among the MT-MMPs, MT4-MMP and F06B09 are distinct from the others and form a group of closely related proteins. It is speculated that these two MT-MMPs share similar biological activities.

It has been shown that MT-MMPs are overexpressed in cancers; in particular, high levels of MT1-MMP are associated with invasiveness of cervical cancer cells (Gilles, et al. (1996) Int. J. Cancer. 65:209-213), of breast, colon, neck and lung carcinomas (Okada, et al. (1995) Proc. Natl. Acad. Sci. USA. 92:2730-2734; Sato, et al. (1994) Nature 370:61-65; Ueno, et al. (1997) Cancer Res. 57:2055-2060), and of gastric cancers (Mori, et al. (1997) Int. J. Cancer. 74:316-321). Furthermore, MT4-MMP was isolated from a breast carcinoma (Puente, et al. (1996) Cancer Res. 56:944-949). Interestingly, the novel MT-MMP identified in the present report constitutes the first MT-MMP isolated from a DC library. It will be interesting to further study F06B09 expression in pathological tissues.

MT-MMPs participate not only in the control of cell migration and tissue remodeling, through their involvement in degradation of extracellular matrix and basement membranes, but also in the processing of pro-enzymes and pro-cytokines.

Of note, the genes for human MT1-, MT2-, and MT3-MMP have been localized on three different chromosomes by in situ hybridization (Mattei, et al. (1997) Genomics. 40:168-169; Mignon, et al. (1995) Genomics. 28:360-361). While the novel MT-MMP gene is on the same chromosome as MT2-MMP, the two genes are at different loci: MT2-MMP is on chromosome 16q12 (Mattei, et al. (1997) Genomics. 40:168-169; Yasumitsu, et al. (1997) DNA Res. 4:77-79), and the novel MT-MMP is on chromosome 16p13.3.

The predominant expression of F06B09 in immature DC, and its putative membrane localization, suggest a role for this MT-MMP in degradation of extracellular matrix during DC migration. Furthermore, like MT1-MMP, which may trigger invasion by tumor cells by activating progelatinase A on the tumor cell surface, this novel MT-MMP could be involved in cancer invasion.

The proteins of this invention are defined in part by their sequences, and by their physicochemical and biological properties. The biological properties of the human proteases described herein, e.g., human F06B09, are defined by their amino acid sequences, and mature sizes. They also should share certain biological enzymatic properties of their respective proteins.

The human protease F06B09 translation product exhibits structural motifs of a member of the matrix metalloproteinase family of proteases, more specifically of a family of matrix degrading proteinases. These proteins, in the latent form, typically possess a prodomain which masks the catalytic site, which chelates a zinc ion. See Vallee and Auld (1990) Biochemistry 29:5647-5659. The processed mature protein is typically a potent cell-matrix degrading enzyme. See, e.g., Birkedal-Hansen (1990) Proc. Nat'l. Acad. Sci. USA 87:5578-5582. The enzyme may remain attached to the cell membrane after activation, and many of these proteases may localize to the leading cellular processes when the cell migrates. This may suggest other protein-protein interactions, e.g., with domains which specifically localize the enzymes by cytoskeletal or other mechanisms.

The pro-enzyme activation (furin) site would correspond to the Arg stretch from 86-90 of SEQ ID NO: 2; or 82-86 of SEQ ID NO: 4. It is likely that the activating enzyme will be one of the furin/PACE proteases, which are essentially ubiquitously expressed. By analogy to the MT1-MMP, these proteases activate other proteases, e.g., gelatinase A, collagenases, and others, that assist in the degradation of the extracellular matrix. See, e.g., Basbaum and Werb (1996) Current Opinion in Cell Biology 8:731-738.
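As a minimal sketch, the RxK/RR consensus can be located with a regular expression and converted to the residue numbering of the listing. The fragment below is taken from the F06B09 translation given with SEQ ID NO: 4 (precursor residues 91-120, counting the initiator Met as 1). The 21-residue signal length is an assumption inferred from the numbering in the sequence listing, and the names `listing_position` and `furin_sites` are hypothetical.

```python
import re

# Assumption: residue 1 of the listing is precursor residue 22, i.e. a
# 21-residue signal sequence, as inferred from the sequence listing numbering.
SIGNAL_LENGTH = 21

# Precursor residues 91-120 of the F06B09 translation (initiator Met = 1).
FRAGMENT_START = 91
FRAGMENT = "SLPDVLGVAGLVRRRRRYALSGSVWKKRTL"

# RxK/RR consensus; the lookahead lets overlapping matches be reported.
FURIN_CONSENSUS = re.compile(r"(?=R.[KR]R)")


def listing_position(precursor_position):
    """Convert a 1-based precursor position to listing numbering (no residue 0)."""
    if precursor_position > SIGNAL_LENGTH:
        return precursor_position - SIGNAL_LENGTH
    return precursor_position - SIGNAL_LENGTH - 1


def furin_sites(fragment, start):
    """Return listing positions at which the RxK/RR consensus begins."""
    return [
        listing_position(start + match.start())
        for match in FURIN_CONSENSUS.finditer(fragment)
    ]


sites = furin_sites(FRAGMENT, FRAGMENT_START)
print(sites)
```

Under these assumptions the consensus begins at residues 82 and 83, inside the run of five arginines that the motif summary for SEQ ID NO: 4 places at about residues 82-86.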

F06B09 contains the zinc binding peptide consensus at about residues 212-222 of SEQ ID NO: 4, with characteristic His residues at 212, 216, and 222. There is also a hemopexin-like domain at about residues 291-525 and a matrixin-like domain corresponding to about residues 1-211. Natural substrates for the proteinase may be identified using standard methods. Substrate sequence specificity may be determined, and searching for such sequences in databases may identify specific candidates for physiological substrates.
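The zinc-binding consensus HExGHxxxxxH and the three His positions can be checked in the same spirit. The sketch below uses precursor residues 221-250 of the F06B09 translation and again assumes a 21-residue signal sequence, inferred from the listing numbering rather than stated in the text.

```python
import re

SIGNAL_LENGTH = 21    # assumption: inferred from the sequence listing numbering
FRAGMENT_START = 221  # precursor position of the first residue below
FRAGMENT = "DGEGTDLFAVAVHEFGHALGLGHSSAPNSI"

# HExGHxxxxxH: the His residues sit at offsets 0, 4, and 10 of the match.
ZINC_CONSENSUS = re.compile(r"HE.GH.{5}H")

match = ZINC_CONSENSUS.search(FRAGMENT)
his_positions = [
    FRAGMENT_START + match.start() + offset - SIGNAL_LENGTH
    for offset in (0, 4, 10)
]
print(match.group(), his_positions)
```

With that offset the matched peptide is HEFGHALGLGH and the His residues fall at 212, 216, and 222, as stated above.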

One of skill will readily recognize that some sequence variations may be tolerated, e.g., conservative substitutions or positions remote from the critical helical structures and remote from the identified or consensus critical active site regions, without altering significantly the biological activity of each respective molecule.

F06B09 proteins are present in specific cell types, e.g., dendritic cells, and the interaction of the protease with a substrate will be important for mediating various aspects of cellular physiology or development. The cellular types which express messages encoding F06B09 suggest that signals important in cell differentiation and development are mediated by them. See, e.g., Gilbert (1991) Developmental Biology (3d ed.) Sinauer Associates, Sunderland, Mass.; Browder, et al. (1991) Developmental Biology (3d ed.) Saunders, Philadelphia, Pa.; Russo, et al. (1992) Development: The Molecular Genetic Approach Springer-Verlag, New York, N.Y.; and Wilkins (1993) Genetic Analysis of Animal Development (2d ed.) Wiley-Liss, New York, N.Y. In particular, the proteases may be necessary for the conversion of pro-proteins to proteins, e.g., cytokine or protein precursors to mature forms, or for proper immunological function, e.g., antigen processing and presentation. Alternatively, the proteases may be important in dendritic cell trafficking, e.g., to traverse through extracellular matrix or vascular surfaces.

II. Definitions

The term “binding composition” refers to molecules that bind with specificity to F06B09, e.g., in an antibody-antigen interaction. However, other compounds, e.g., complex associated proteins, may also specifically associate with F06B09 to the exclusion of other molecules. Typically, the association will be in a natural physiologically relevant protein-protein interaction, either covalent or non-covalent, and may include members of a multiprotein complex, including carrier compounds or dimerization partners. The molecule may be a polymer, or chemical reagent. A functional analog may be a protease with structural modifications, or may be a wholly unrelated molecule, e.g., which has a molecular shape which interacts with the appropriate substrate cleavage determinants.

The term “binding agent:F06B09 protein complex”, as used herein, refers to a complex of a binding agent and an F06B09 protein that is formed by specific binding of the binding agent to the F06B09 protein. Specific binding of the binding agent means that the binding agent has a specific binding site that recognizes a site on the F06B09 protein, typically in the native conformation, but possibly in a denatured conformation, e.g., in a Western blot. For example, antibodies raised to an F06B09 protein and recognizing an epitope on the F06B09 protein are capable of forming a binding agent:F06B09 protein complex by specific binding. Typically, the formation of a binding agent:F06B09 protein complex allows the measurement of F06B09 protein in a biological sample, e.g., a mixture with other proteins and biologics. The term “antibody:F06B09 protein complex” refers to an embodiment in which the binding agent is an antibody. The antibody may be monoclonal, polyclonal, or a binding fragment of an antibody, e.g., an Fab, F(ab)2, or Fv fragment. The antibody will preferably be a polyclonal antibody for cross-reactivity determinations.

“Homologous” nucleic acid sequences, when compared, exhibit significant similarity or identity. The standards for homology in nucleic acids are either measures for homology generally used in the art by sequence comparison and/or phylogenetic relationship, or based upon hybridization conditions. Hybridization conditions are described in greater detail below.

An “isolated” nucleic acid is a nucleic acid, e.g., an RNA, DNA, or a mixed polymer, which is substantially separated from other biologic components which naturally accompany a native sequence, e.g., proteins and flanking genomic sequences from the originating species. The term embraces a nucleic acid sequence which has been removed from its naturally occurring environment, and includes recombinant or cloned DNA isolates and chemically synthesized analogs, or analogs biologically synthesized by heterologous systems. A substantially pure molecule includes isolated forms of the molecule. An isolated nucleic acid will usually contain homogeneous nucleic acid molecules, but will, in some embodiments, contain nucleic acids with minor sequence heterogeneity. This heterogeneity is typically found at the polymer ends or portions not critical to a desired biological function or activity.

As used herein, the term “F06B09” protein shall encompass, when used in a protein context, a protein having amino acid sequences, particularly from the protein motif portions, shown in SEQ ID NO: 2 or 4, respectively. In many contexts, a significant fragment of such a protein will be functionally equivalent. The invention also embraces a polypeptide which exhibits similar structure to human F06B09 protein, e.g., which interacts with F06B09 specific binding components. These binding components, e.g., antibodies, typically bind to F06B09 protein with high affinity, e.g., at least about 100 nM, usually better than about 30 nM, preferably better than about 10 nM, and more preferably better than about 3 nM.

The term “polypeptide” or “protein” as used herein includes a significant fragment or segment of protease motif portion of F06B09 protein, and encompasses a stretch of amino acid residues of at least about 8 amino acids, generally at least about 10 amino acids, more generally at least about 12 amino acids, often at least about 14 amino acids, more often at least about 16 amino acids, typically at least about 18 amino acids, more typically at least about 20 amino acids, usually at least about 22 amino acids, more usually at least about 24 amino acids, preferably at least about 26 amino acids, more preferably at least about 28 amino acids, and, in particularly preferred embodiments, at least about 30 or more amino acids, e.g., 35, 40, 45, 50, 60, 70, 80, 100, etc. Preferred ends of such polypeptides will correspond to a motif or boundary described above, e.g., in Table 1. Preferably, a polypeptide will contain a plurality of distinct, e.g., discrete or nonoverlapping, segments of the specified length. Typically, the plurality will be at least two, more usually at least three, and preferably 5, 7, or even more. While the length minima are provided, longer lengths, of various sizes, may be appropriate, e.g., one of length 7, and two of length 12.

A “recombinant” nucleic acid is defined either by its method of production or its structure. In reference to its method of production, e.g., a product made by a process, the process is use of recombinant nucleic acid techniques, e.g., involving human intervention in the nucleotide sequence, typically selection or production. Alternatively, it can be a nucleic acid made by generating a sequence comprising fusion of two fragments which are not naturally contiguous to each other, but is meant to exclude products of nature, e.g., naturally occurring mutants. Thus, for example, products made by transforming cells with any non-naturally occurring vector are encompassed, as are nucleic acids comprising sequence derived using any synthetic oligonucleotide process. Such is often done to replace a codon with a redundant codon encoding the same or a conservative amino acid, while typically introducing or removing a sequence recognition site. Alternatively, it is performed to join together nucleic acid segments of desired functions to generate a single genetic entity comprising a desired combination of functions not found in the commonly available natural forms. Restriction enzyme recognition sites are often the target of such artificial manipulations, but other site specific targets, e.g., promoters, DNA replication sites, regulation sequences, control sequences, or other useful features may be incorporated by design. A similar concept is intended for a recombinant, e.g., fusion, polypeptide. Specifically included are synthetic nucleic acids which, by genetic code redundancy, encode polypeptides similar to fragments of these antigens, and fusions of sequences from various different species variants.
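Replacement of a codon with a redundant codon can be illustrated with the standard genetic code. The following is a sketch only; the helper names `translate` and `synonymous_codons` are hypothetical, and the example uses the first four codons of the F06B09 open reading frame (ATG CGG CTG CGG, encoding Met Arg Leu Arg).

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate a DNA string codon by codon, ignoring any trailing partial codon."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def synonymous_codons(codon):
    """Return the other codons that encode the same amino acid."""
    amino_acid = CODON_TABLE[codon]
    return sorted(
        other for other, aa in CODON_TABLE.items()
        if aa == amino_acid and other != codon
    )


original = "ATGCGGCTGCGG"  # Met Arg Leu Arg
variant = "ATGCGCCTGCGG"   # second codon CGG replaced by the redundant codon CGC

print(translate(original), translate(variant), synonymous_codons("CGG"))
```

Both strings translate to the same peptide, which is the sense in which such a substitution leaves the encoded protein unchanged while altering the nucleotide sequence, e.g., to create or remove a recognition site.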

“Solubility” is reflected by sedimentation measured in Svedberg units, which are a measure of the sedimentation velocity of a molecule under particular conditions. The determination of the sedimentation velocity was classically performed in an analytical ultracentrifuge, but is typically now performed in a standard ultracentrifuge. See, Freifelder (1982) Physical Biochemistry (2d ed.) W. H. Freeman & Co., San Francisco, Calif.; and Cantor and Schimmel (1980) Biophysical Chemistry parts 1-3, W. H. Freeman & Co., San Francisco, Calif. As a crude determination, a sample containing a putatively soluble polypeptide is spun in a standard full sized ultracentrifuge at about 50K rpm for about 10 minutes, and soluble molecules will remain in the supernatant. A soluble particle or polypeptide will typically be less than about 30 S, more typically less than about 15 S, usually less than about 10 S, more usually less than about 6 S, and, in particular embodiments, preferably less than about 4 S, and more preferably less than about 3 S. Solubility of a polypeptide or fragment depends upon the environment and the polypeptide. Many parameters affect polypeptide solubility, including temperature, electrolyte environment, size and molecular characteristics of the polypeptide, and nature of the solvent. Typically, the temperature at which the polypeptide is used ranges from about 4° C. to about 65° C. Usually the temperature at use is greater than about 18° C. and more usually greater than about 22° C. For diagnostic purposes, the temperature will usually be about room temperature or warmer, but less than the denaturation temperature of components in the assay. For therapeutic purposes, the temperature will usually be body temperature, typically about 37° C. for humans, though under certain situations the temperature may be raised or lowered in situ or in vitro.

The size and structure of the polypeptide should generally be in a substantially stable state, and usually not in a denatured state. The polypeptide may be associated with other polypeptides in a quaternary structure, e.g., to confer solubility, or associated with lipids or detergents in a manner which approximates natural lipid bilayer interactions.

The solvent will usually be a biologically compatible buffer, of a type used for preservation of biological activities, and will usually approximate a physiological solvent. Usually the solvent will have a neutral pH, typically between about 5 and 10, and preferably about 7.5. On some occasions, a detergent will be added, typically a mild non-denaturing one, e.g., CHS (cholesteryl hemisuccinate) or CHAPS (3-[(3-cholamidopropyl)dimethylammonio]-1-propane sulfonate), or one at a low enough concentration to avoid significant disruption of structural or physiological properties of the protein.

“Substantially pure” in a protein context typically means that the protein is isolated from other contaminating proteins, nucleic acids, and other biologicals derived from the original source organism. Purity, or “isolation”, may be assayed by standard methods, and will ordinarily be at least about 50% pure, more ordinarily at least about 60% pure, generally at least about 70% pure, more generally at least about 80% pure, often at least about 85% pure, more often at least about 90% pure, preferably at least about 95% pure, more preferably at least about 98% pure, and in most preferred embodiments, at least 99% pure. Similar concepts apply, e.g., to antibodies or nucleic acids.

“Substantial similarity” in the nucleic acid sequence comparison context means either that the segments, or their complementary strands, when compared, are identical when optimally aligned, with appropriate nucleotide insertions or deletions, in at least about 50% of the nucleotides, generally at least about 56%, more generally at least about 59%, ordinarily at least about 62%, more ordinarily at least about 65%, often at least about 68%, more often at least about 71%, typically at least about 74%, more typically at least about 77%, usually at least about 80%, more usually at least about 85%, preferably at least about 90%, more preferably at least about 95 to 98% or more, and in particular embodiments, as high as about 99% or more of the nucleotides. Alternatively, substantial similarity exists when the segments will hybridize under selective hybridization conditions, to a strand, or its complement, typically using a sequence derived from SEQ ID NO: 1 or 3. Typically, selective hybridization will occur when there is at least about 55% similarity over a stretch of at least about 30 nucleotides, preferably at least about 65% over a stretch of at least about 25 nucleotides, more preferably at least about 75%, and most preferably at least about 90% over about 20 nucleotides. See Kanehisa (1984) Nucl. Acids Res. 12:203-213. The length of similarity comparison, as described, may be over longer stretches, and in certain embodiments will be over a stretch of at least about 17 nucleotides, usually at least about 20 nucleotides, more usually at least about 24 nucleotides, typically at least about 28 nucleotides, more typically at least about 40 nucleotides, preferably at least about 50 nucleotides, and more preferably at least about 75 to 100 or more nucleotides, e.g., 150, 200, etc. Various combinations of a plurality of such segments will also be made.
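As an illustration of how the percentage and stretch-length figures above might be evaluated for two already-aligned segments, the following is a minimal sketch. The function names, the treatment of gaps as mismatches, and the use of a sliding window are assumptions made for this example; the 75% figure is omitted from the tier list because no stretch length is stated for it.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent of alignment columns with the same nucleotide in both rows.

    Inputs are two rows of an existing alignment ('-' marks a gap).
    Columns that are gaps in both rows are ignored; a gap opposite a
    nucleotide is counted as a mismatch (an assumption of this sketch).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    columns = [
        (x, y)
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if not (x == "-" and y == "-")
    ]
    if not columns:
        return 0.0
    matches = sum(1 for x, y in columns if x == y)
    return 100.0 * matches / len(columns)


def best_window_identity(aligned_a, aligned_b, window):
    """Highest percent identity over any stretch of `window` columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if len(aligned_a) < window:
        return 0.0
    return max(
        percent_identity(aligned_a[i:i + window], aligned_b[i:i + window])
        for i in range(len(aligned_a) - window + 1)
    )


# (stretch length, minimum percent) pairs taken from the paragraph above.
TIERS = [(30, 55.0), (25, 65.0), (20, 90.0)]


def selective_tiers(aligned_a, aligned_b):
    """Report, for each tier, whether some stretch meets the stated figure."""
    return [
        best_window_identity(aligned_a, aligned_b, window) >= minimum
        for window, minimum in TIERS
    ]


a = "ACGT" * 8            # 32 nucleotides
b = "TT" + a[2:]          # same sequence with two substitutions
print(percent_identity(a, b), selective_tiers(a, b))
```

Such a calculation only approximates hybridization behavior, which also depends on the stringency conditions discussed elsewhere in this section.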

For sequence comparison, typically one sequence acts as a reference sequence, to which test sequences are compared. When using a sequence comparison algorithm, test and reference sequences are input into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated. The sequence comparison algorithm then calculates the percent sequence identity for the test sequence(s) relative to the reference sequence, based on the designated program parameters.

Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith and Waterman (1981) Adv. Appl. Math. 2:482, by the homology alignment algorithm of Needleman and Wunsch (1970) J. Mol. Biol. 48:443, by the search for similarity method of Pearson and Lipman (1988) Proc. Nat'l. Acad. Sci. USA 85:2444, by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, Wis.), or by visual inspection (see generally Ausubel et al., supra).
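For orientation, the local homology approach can be sketched as a small dynamic program. This is a simplified illustration in the style of Smith and Waterman, not the GAP or BESTFIT implementation: it returns only the best local score, uses a linear gap penalty, and the scoring values (match 2, mismatch -1, gap -2) are arbitrary assumptions chosen for the example.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between sequences a and b.

    Simplified sketch: linear gap penalty, score only (no traceback).
    Only the previous row of the scoring matrix is kept in memory.
    """
    previous = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        current = [0]
        for j in range(1, len(b) + 1):
            substitution = match if a[i - 1] == b[j - 1] else mismatch
            score = max(
                0,                              # start a new local alignment
                previous[j - 1] + substitution,  # align a[i-1] with b[j-1]
                previous[j] + gap,              # gap in b
                current[j - 1] + gap,           # gap in a
            )
            current.append(score)
            best = max(best, score)
        previous = current
    return best


print(smith_waterman_score("TTACGTTT", "GGACGTGG"))
```

Flooring every cell at zero is what makes the alignment local; removing that floor and initializing the borders with gap penalties would give a global alignment of the Needleman and Wunsch type.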

One example of a useful algorithm is PILEUP. PILEUP creates a multiple sequence alignment from a group of related sequences using progressive, pairwise alignments to show relationship and percent sequence identity. It also plots a tree or dendrogram showing the clustering relationships used to create the alignment. PILEUP uses a simplification of the progressive alignment method of Feng and Doolittle (1987) J. Mol. Evol. 25:351-360. The method used is similar to the method described by Higgins and Sharp (1989) CABIOS 5:151-153. The program can align up to 300 sequences, each of a maximum length of 5,000 nucleotides or amino acids. The multiple alignment procedure begins with the pairwise alignment of the two most similar sequences, producing a cluster of two aligned sequences. This cluster is then aligned to the next most related sequence or cluster of aligned sequences. Two clusters of sequences are aligned by a simple extension of the pairwise alignment of two individual sequences. The final alignment is achieved by a series of progressive, pairwise alignments. The program is run by designating specific sequences and their amino acid or nucleotide coordinates for regions of sequence comparison and by designating the program parameters. For example, a reference sequence can be compared to other test sequences to determine the percent sequence identity relationship using the following parameters: default gap weight (3.00), default gap length weight (0.10), and weighted end gaps.

Another example of an algorithm that is suitable for determining percent sequence identity and sequence similarity is the BLAST algorithm, which is described in Altschul, et al. (1990) J. Mol. Biol. 215:403-410. Software for performing BLAST analyses is publicly available through the National Center for Biotechnology Information (http://www.ncbi.nlm.nih.gov/). This algorithm involves first identifying high scoring sequence pairs (HSPs) by identifying short words of length W in the query sequence, which either match or satisfy some positive-valued threshold score T when aligned with a word of the same length in a database sequence. T is referred to as the neighborhood word score threshold (Altschul, et al., supra). These initial neighborhood word hits act as seeds for initiating searches to find longer HSPs containing them. The word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached. The BLAST algorithm parameters W, T, and X determine the sensitivity and speed of the alignment. The BLAST program uses as defaults a wordlength (W) of 11, the BLOSUM62 scoring matrix (see Henikoff and Henikoff (1992) Proc. Nat'l. Acad. Sci. USA 89:10915), alignments (B) of 50, expectation (E) of 10, M=5, N=4, and a comparison of both strands.
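The seed-and-extend procedure described above can be sketched in a few lines. This is a deliberately simplified illustration and not the BLAST program itself: seeds are exact word matches only (the neighborhood threshold T is not modeled), extension is ungapped, and the function names are hypothetical. The scores follow the M=5 reward stated above, with N read here as a mismatch penalty of 4 (applied as -4), which is an assumption of this sketch; the short word length used in the example is chosen only to keep the toy sequences small.

```python
from collections import defaultdict


def find_seeds(query, subject, w):
    """Return (query_pos, subject_pos) pairs where a word of length w matches exactly."""
    index = defaultdict(list)
    for j in range(len(subject) - w + 1):
        index[subject[j:j + w]].append(j)
    return [
        (i, j)
        for i in range(len(query) - w + 1)
        for j in index.get(query[i:i + w], [])
    ]


def extend(q, s, start_score, x_drop, match=5, mismatch=-4):
    """Ungapped extension over q and s; returns (best_score, best_length).

    Stops when the score falls x_drop below its maximum, when the
    cumulative score reaches zero or below, or when either sequence ends.
    """
    score = best = start_score
    best_len = 0
    for k, (x, y) in enumerate(zip(q, s), start=1):
        score += match if x == y else mismatch
        if score > best:
            best, best_len = score, k
        elif score <= 0 or best - score >= x_drop:
            break
    return best, best_len


def extend_seed(query, subject, i, j, w, x_drop=20, match=5, mismatch=-4):
    """Extend one seed in both directions; returns (score, query_start, query_end)."""
    seed_score = w * match
    right_best, right_len = extend(
        query[i + w:], subject[j + w:], seed_score, x_drop, match, mismatch
    )
    # Extend leftward by walking the reversed prefixes.
    left_best, left_len = extend(
        query[:i][::-1], subject[:j][::-1], right_best, x_drop, match, mismatch
    )
    return left_best, i - left_len, i + w + right_len


query = "GGGACGTACGTACGTTTT"
subject = "CCCACGTACGTACGTAAA"
seeds = find_seeds(query, subject, 4)
print(extend_seed(query, subject, 3, 3, 4))
```

A larger X lets extension ride through longer runs of mismatches before halting, which increases sensitivity at the cost of speed, consistent with the role of W, T, and X noted above.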

In addition to calculating percent sequence identity, the BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin and Altschul (1993) Proc. Nat'l. Acad. Sci. USA 90:5873-5877). One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance. For example, a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.1, more preferably less than about 0.01, and most preferably less than about 0.001.

A further indication that two nucleic acid sequences or polypeptides are substantially identical is that the polypeptide encoded by the first nucleic acid is immunologically cross reactive with the polypeptide encoded by the second nucleic acid, as described below. Thus, a polypeptide is typically substantially identical to a second polypeptide, for example, where the two peptides differ only by conservative substitutions. Another indication that two nucleic acid sequences are substantially identical is that the two molecules hybridize to each other under stringent conditions, as described below.

“Stringent conditions”, in referring to homology or substantial similarity in the hybridization context, will be stringent combined conditions of salt, temperature, organic solvents, and other parameters, typically those controlled in hybridization reactions. The combination of parameters is generally more important than the measure of any single parameter. See, e.g., Wetmur and Davidson (1968) J. Mol. Biol. 31:349-370. A nucleic acid probe which binds to a target nucleic acid under stringent conditions is specific for said target nucleic acid. Such a probe is typically more than 11 nucleotides in length, and is sufficiently identical or complementary to a target nucleic acid over the region specified by the sequence of the probe to bind the target under stringent hybridization conditions. Hybridization under stringent conditions should give a signal of at least 2-fold over background, preferably at least 3-5 fold or more.

F06B09 proteins from other mammalian species can be cloned and isolated by cross-species hybridization of closely related species. See, e.g., below. Similarity may be relatively low between distantly related species, and thus hybridization of relatively closely related species is advisable. Alternatively, preparation of an antibody preparation which exhibits less species specificity may be useful in expression cloning approaches.

The phrase “specifically binds to an antibody” or “specifically immunoreactive with”, when referring to a protein or peptide, refers to a binding reaction which is determinative of the presence of the protein in the presence of a heterogeneous population of proteins and other biological components. Thus, under designated immunoassay conditions, the specified antibodies bind to a particular protein and do not significantly bind other proteins present in the sample. Specific binding to an antibody under such conditions may require an antibody that is selected for its specificity for a particular protein. For example, antibodies raised to the human F06B09 protein immunogen with the amino acid sequence depicted in SEQ ID NO: 2 or 4 can be selected by immunoaffinity or similar methods to obtain antibodies specifically immunoreactive with F06B09 proteins and not with other proteins.

III. Nucleic Acids

F06B09 proteins are exemplary of larger classes of structurally and functionally related proteins. The F06B09 proteins will typically serve to cleave or process various proteins produced or processed by various cell types, e.g., for antigen presentation. The preferred embodiments, as disclosed, will be useful in standard procedures to isolate genes from different individuals or other species, e.g., warm blooded animals, such as birds and mammals. Cross hybridization will allow isolation of related genes encoding proteins from individuals, strains, or species. A number of different approaches are available to successfully isolate a suitable nucleic acid clone based upon the information provided herein. Southern blot hybridization studies can qualitatively determine the presence of homologous genes in human, monkey, rat, dog, cow, and rabbit genomes under specific hybridization conditions.

Complementary sequences will also be used as probes or primers. Based upon identification of the likely amino terminus, other peptides should be particularly useful, e.g., coupled with anchored vector or poly-A complementary PCR techniques or with complementary DNA of other peptides.

Techniques for nucleic acid manipulation of genes encoding F06B09 proteins, such as subcloning nucleic acid sequences encoding polypeptides into expression vectors, labeling probes, DNA hybridization, and the like are described generally in Sambrook, et al. (1989) Molecular Cloning: A Laboratory Manual (2nd ed.) Vol. 1-3, Cold Spring Harbor Laboratory, Cold Spring Harbor Press, NY, which is incorporated herein by reference. This manual is hereinafter referred to as “Sambrook, et al.”

There are various methods of isolating DNA sequences encoding F06B09 proteins. For example, DNA is isolated from a genomic or cDNA library using labeled oligonucleotide probes having sequences identical or complementary to the sequences disclosed herein. Full-length probes may be used, or oligonucleotide probes may be generated by comparison of the sequences disclosed. Such probes can be used directly in hybridization assays to isolate DNA encoding F06B09 proteins, or probes can be designed for use in amplification techniques such as PCR, for the isolation of DNA encoding F06B09 proteins.

To prepare a cDNA library, mRNA is isolated from cells, preferably cells which express high levels of an F06B09 protein. cDNA is prepared from the mRNA and ligated, e.g., into a recombinant vector. The vector is transfected into a recombinant host for propagation, screening, and cloning. Methods for making and screening cDNA libraries are well known. See Gubler and Hoffman (1983) Gene 25:263-269 and Sambrook, et al.

For a genomic library, the DNA can be extracted from tissue, and often either mechanically sheared or enzymatically digested to yield fragments of about 12-20 kb. The fragments are then separated by gradient centrifugation and cloned in bacteriophage lambda vectors. These vectors and phage are packaged in vitro, as described in Sambrook, et al. Recombinant phage are analyzed by plaque hybridization as described in Benton and Davis (1977) Science 196:180-182. Colony hybridization is carried out as generally described in, e.g., Grunstein, et al. (1975) Proc. Natl. Acad. Sci. USA 72:3961-3965.

DNA encoding an F06B09 protein can be identified in either cDNA or genomic libraries by its ability to hybridize with the nucleic acid probes described herein, e.g., in colony or plaque hybridization assays. The corresponding DNA regions are isolated, e.g., by standard methods familiar to those of skill in the art. See, e.g., Sambrook, et al.

Various methods of amplifying target sequences, such as the polymerase chain reaction, can also be used to prepare DNA encoding F06B09 proteins. Polymerase chain reaction (PCR) technology may be used to amplify such nucleic acid sequences directly from mRNA, from cDNA, and/or from genomic libraries or cDNA libraries. The isolated sequences encoding F06B09 proteins may also be used as templates for PCR amplification.

Typically, in PCR techniques, oligonucleotide primers complementary to two flanking regions in the DNA region to be amplified are synthesized. The polymerase chain reaction is then carried out using the two primers. See Innis, et al. (eds. 1990) PCR Protocols: A Guide to Methods and Applications Academic Press, San Diego, Calif. Primers can be selected to amplify the entire regions encoding a full-length human F06B09 protein or to amplify smaller DNA segments, as desired. Once such regions are PCR-amplified, they can be sequenced and oligonucleotide probes can be prepared from sequence obtained using standard techniques. These probes can then be used to isolate DNAs encoding F06B09 proteins.
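The relationship between a target region and its two flanking primers can be shown with a short sketch. The function names, the toy template, and the fixed primer length are assumptions for illustration; the sketch takes the primers from the two ends of the region to be amplified and does not address melting temperature, secondary structure, or specificity, all of which matter in practice.

```python
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def flanking_primers(template, start, end, primer_len=20):
    """Primers for amplifying template[start:end], both written 5' to 3'.

    `template` is the top strand, 5' to 3'. The forward primer copies the
    first primer_len bases of the region; the reverse primer is the
    reverse complement of the last primer_len bases, so it anneals to the
    top strand and primes synthesis back toward the forward primer.
    """
    if not (0 <= start < end <= len(template)):
        raise ValueError("region must lie within the template")
    if end - start < primer_len:
        raise ValueError("region is shorter than the primer length")
    forward = template[start:start + primer_len]
    reverse = reverse_complement(template[end - primer_len:end])
    return forward, reverse


# Toy 20-base template; real primers are usually longer than 6 bases.
template = "AAACCCGGGTTTACGTACGT"
print(flanking_primers(template, 0, 20, primer_len=6))
```

Choosing `start` and `end` at the boundaries of the coding region corresponds to amplifying the full-length sequence, while narrower coordinates correspond to the smaller segments mentioned above.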

Oligonucleotides for use as probes are usually chemically synthesized according to the solid phase phosphoramidite triester method first described by Beaucage and Caruthers (1981) Tetrahedron Lett. 22(20):1859-1862, or using an automated synthesizer, as described in Needham-VanDevanter, et al. (1984) Nucleic Acids Res. 12:6159-6168. Purification of oligonucleotides is performed, e.g., by native acrylamide gel electrophoresis or by anion-exchange HPLC as described in Pearson and Regnier (1983) J. Chrom. 255:137-149. The sequence of the synthetic oligonucleotide can be verified using, e.g., the chemical degradation method of Maxam, A. M. and Gilbert, W. in Grossman and Moldave (eds. 1980) Methods in Enzymology 65:499-560, Academic Press, New York.

An isolated nucleic acid encoding a human F06B09 protein was identified. The nucleotide sequence, corresponding open reading frames, and mature peptides are provided in Table 1 or SEQ ID NO: 1 or 3.

This invention provides isolated DNA or fragments to encode an F06B09 protein or specific fragment thereof. In addition, this invention provides isolated or recombinant DNA which encodes a protein or polypeptide, and which is capable of hybridizing under appropriate conditions, e.g., high stringency, with the DNA sequences described herein. Said biologically active protein or polypeptide can be a functional protease segment, or fragment, and have an amino acid sequence as disclosed in SEQ ID NO: 2 or 4. Preferred embodiments will be full length natural sequences, from isolates, or proteolytic fragments thereof. Further, this invention contemplates the use of isolated or recombinant DNA, or fragments thereof, which encode proteins which exhibit high measures of identity to an F06B09 protein, or which were isolated, e.g., using cDNA encoding an F06B09 protease polypeptide as a probe. The isolated DNA can have the respective regulatory sequences in the 5′ and 3′ flanks, e.g., promoters, enhancers, poly-A addition signals, and others.

IV. Making Human F06B09 Proteins

DNAs which encode an F06B09 protein, or fragments thereof, can be obtained by chemical synthesis, screening cDNA libraries, or by screening genomic libraries prepared from a wide variety of cell lines or tissue samples.

These DNAs can be expressed in a wide variety of host cells for the synthesis of a full-length protein or fragments which can in turn, e.g., be used to generate polyclonal or monoclonal antibodies; for binding studies; for construction and expression of modified molecules; and for structure/function studies. Each F06B09 protein, or its fragments, can be expressed in host cells that are transformed or transfected with appropriate expression vectors. These molecules can be substantially purified to be free of protein or cellular contaminants, other than those derived from the recombinant host, and therefore are particularly useful in pharmaceutical compositions when combined with a pharmaceutically acceptable carrier and/or diluent. The antigen, e.g., F06B09, or portions thereof, may be expressed as fusions with other proteins or possessing an epitope tag.

Expression vectors are typically self-replicating DNA or RNA constructs containing the desired antigen gene or its fragments, usually operably linked to appropriate genetic control elements that are recognized in a suitable host cell. The specific type of control elements necessary to effect expression will depend upon the eventual host cell used. Generally, the genetic control elements can include a prokaryotic promoter system or a eukaryotic promoter expression control system, and typically include a transcriptional promoter, an optional operator to control the onset of transcription, transcription enhancers to elevate the level of mRNA expression, a sequence that encodes a suitable ribosome binding site, and sequences that terminate transcription and translation. Expression vectors also usually contain an origin of replication that allows the vector to replicate independently from the host cell.

The vectors of this invention contain DNAs which encode an F06B09protein, or a significant fragment thereof, typically encoding, e.g., abiologically active polypeptide, or protein. The DNA can be under thecontrol of a viral promoter and can encode a selection marker. Thisinvention further contemplates use of such expression vectors which arecapable of expressing eukaryotic cDNA coding for an F06B09 protein in aprokaryotic or eukaryotic host, where the vector is compatible with thehost and where the eukaryotic cDNA coding for the protein is insertedinto the vector such that growth of the host containing the vectorexpresses the cDNA in question. Usually, expression vectors are designedfor stable replication in their host cells or for amplification togreatly increase the total number of copies of the desirable gene percell. It is not always necessary to require that an expression vectorreplicate in a host cell, e.g., it is possible to effect transientexpression of the protein or its fragments in various hosts usingvectors that do not contain a replication origin that is recognized bythe host cell. It is also possible to use vectors that cause integrationof an F06B09 protein gene or its fragments into the host DNA byrecombination, or to integrate a promoter which controls expression ofan endogenous gene.

Vectors, as used herein, contemplate plasmids, viruses, bacteriophage, integratable DNA fragments, and other vehicles which enable the integration of DNA fragments into the genome of the host. Expression vectors are specialized vectors which contain genetic control elements that effect expression of operably linked genes. Plasmids are the most commonly used form of vector, but many other forms of vectors which serve an equivalent function are suitable for use herein. See, e.g., Pouwels, et al. (1985 and Supplements) Cloning Vectors: A Laboratory Manual Elsevier, N.Y.; and Rodriguez, et al. (eds. 1988) Vectors: A Survey of Molecular Cloning Vectors and Their Uses Butterworths, Boston, Mass.

Suitable host cells include prokaryotes, lower eukaryotes, and higher eukaryotes. Prokaryotes include both gram negative and gram positive organisms, e.g., E. coli and B. subtilis. Lower eukaryotes include yeasts, e.g., S. cerevisiae and Pichia, and species of the genus Dictyostelium. Higher eukaryotes include established tissue culture cell lines from animal cells, both of non-mammalian origin, e.g., insect cells, and birds, and of mammalian origin, e.g., human, primates, and rodents.

Prokaryotic host-vector systems include a wide variety of vectors for many different species. As used herein, E. coli and its vectors will be used generically to include equivalent vectors used in other prokaryotes. A representative vector for amplifying DNA is pBR322 or its derivatives. Vectors that can be used to express F06B09 proteins or fragments thereof include, but are not limited to, such vectors as those containing the lac promoter (pUC-series); trp promoter (pBR322-trp); lpp promoter (the pIN-series); lambda-pP or pR promoters (pOTS); or hybrid promoters such as ptac (pDR540). See Brosius, et al. (1988) “Expression Vectors Employing Lambda-, trp-, lac-, and lpp-derived Promoters”, in Rodriguez and Denhardt (eds.) Vectors: A Survey of Molecular Cloning Vectors and Their Uses 10:205-236 Butterworths, Boston, Mass.

Lower eukaryotes, e.g., yeasts and Dictyostelium, may be transformed with F06B09 protein sequence containing vectors. For purposes of this invention, the most common lower eukaryotic host is the baker's yeast, Saccharomyces cerevisiae. It will be used generically to represent lower eukaryotes although a number of other strains and species will be essentially equivalent. Yeast vectors typically consist of a replication origin (unless of the integrating type), one or more selection genes, a promoter, DNA encoding the desired protein or its fragments, and sequences for translation termination, polyadenylation, and transcription termination. Suitable expression vectors for yeast include such constitutive promoters as 3-phosphoglycerate kinase and various other glycolytic enzyme gene promoters or such inducible promoters as the alcohol dehydrogenase 2 promoter or metallothionein promoter. Suitable vectors include derivatives of the following types: self-replicating low copy number (such as the YRp-series), self-replicating high copy number (such as the YEp-series), integrating types (such as the YIp-series), or mini-chromosomes (such as the YCp-series).

Higher eukaryotic tissue culture cells are typically the preferred hostcells for expression of the functionally active F06B09 proteasepolypeptides. In principle, many higher eukaryotic tissue culture celllines may be used, e.g., insect baculovirus expression systems, whetherfrom an invertebrate or vertebrate source. However, mammalian cells arepreferred to achieve proper natural processing, both cotranslationallyand posttranslationally. Transformation or transfection and propagationof such cells are routine. Useful cell lines include HeLa cells, Chinesehamster ovary (CHO) cell lines, baby rat kidney (BRK) cell lines, insectcell lines, bird cell lines, and monkey (COS) cell lines. Expressionvectors for such cell lines usually include an origin of replication, apromoter, a translation initiation site, RNA splice sites (e.g., ifgenomic DNA is used), a polyadenylation site, and a transcriptiontermination site. These vectors also may contain selection and/oramplification genes. Suitable expression vectors may be plasmids,viruses, or retroviruses carrying promoters derived, e.g., from suchsources as from adenovirus, SV40, parvoviruses, vaccinia virus, orcytomegalovirus. Representative examples of suitable expression vectorsinclude pCDNA1; pCD, see Okayama, et al. (1985) Mol. Cell Biol.5:1136-1142; pMC1neo Poly-A, see Thomas, et al. (1987) Cell 51:503-512;and a baculovirus vector such as pAC 373 or pAC 610.

It is likely that F06B09 protein need not be glycosylated to elicit biological responses. However, it will occasionally be desirable to express an F06B09 polypeptide in a system which provides a specific or defined glycosylation pattern. In this case, the usual pattern will be that provided naturally by the expression system. However, the pattern will be modifiable by exposing the polypeptide, e.g., in unglycosylated form, to appropriate glycosylating proteins introduced into a heterologous expression system. For example, an F06B09 protein gene may be co-transformed with one or more genes encoding mammalian or other glycosylating enzymes. It is further understood that over glycosylation may be detrimental to F06B09 protein biological activity, and that one of skill may perform routine testing to optimize the degree of glycosylation which confers optimal biological activity.

An F06B09 protein, or a fragment thereof, may be engineered to be phosphatidyl inositol (PI) linked to a cell membrane, but can be removed from membranes by treatment with a phosphatidyl inositol cleaving enzyme, e.g., phosphatidyl inositol phospholipase-C. This releases the antigen in a biologically active form, and allows purification by standard procedures of protein chemistry. See, e.g., Low (1989) Biochim. Biophys. Acta 988:427-454; Tse, et al. (1985) Science 230:1003-1008; and Brunner, et al. (1991) J. Cell Biol. 114:1275-1283.

Now that F06B09 proteins have been characterized, fragments orderivatives thereof can be prepared by conventional processes forsynthesizing peptides. These include processes such as are described inStewart and Young (1984) Solid Phase Peptide Synthesis Pierce ChemicalCo., Rockford, Ill.; Bodanszky and Bodanszky (1984) The Practice ofPeptide Synthesis Springer-Verlag, New York, N.Y.; Bodanszky (1984) ThePrinciples of Peptide Synthesis Springer-Verlag, New York, N.Y.; andDawson, et al. (1994) Science 266:776-779. For example, an azideprocess, an acid chloride process, an acid anhydride process, a mixedanhydride process, an active ester process (for example, p-nitrophenylester, N-hydroxysuccinimide ester, or cyanomethyl ester), acarbodiimidazole process, an oxidative-reductive process, or adicyclohexylcarbodiimide (DCCD)/additive process can be used. Solidphase and solution phase syntheses are both applicable to the foregoingprocesses.

The prepared protein and fragments thereof can be isolated and purified from the reaction mixture by means of peptide separation, for example, by extraction, precipitation, electrophoresis and various forms of chromatography, and the like. The F06B09 proteins of this invention can be obtained in varying degrees of purity depending upon their desired use. Purification can be accomplished by use of known protein purification techniques or by the use of the antibodies or binding partners herein described, e.g., in immunoabsorbant affinity chromatography. This immunoabsorbant affinity chromatography is carried out, e.g., by first linking the antibodies to a solid support and then contacting the linked antibodies with solubilized lysates of appropriate source cells, lysates of other cells expressing the ligand, or lysates or supernatants of cells producing the F06B09 proteins as a result of recombinant DNA techniques, see below.

Multiple cell lines may be screened for one which expresses an F06B09 polypeptide or protein at a high level compared with other cells. Various cell lines, e.g., a mouse thymic stromal cell line TA4, are screened and selected for their favorable handling properties. Natural F06B09 proteins can be isolated from natural sources, or by expression from a transformed cell using an appropriate expression vector. Purification of the expressed protein is achieved by standard procedures, or may be combined with engineered means for effective purification at high efficiency from cell lysates or supernatants. Epitope or other tags, e.g., FLAG or His₆ segments, can be used for such purification features.

V. Antibodies

Antibodies can be raised to various F06B09 proteins, including individual, polymorphic, allelic, strain, or species variants, and fragments thereof, both in their naturally occurring (full-length) forms and in their recombinant forms. Additionally, antibodies can be raised to F06B09 proteins in either their native (or active) forms or in their inactive, e.g., denatured, forms. Anti-idiotypic antibodies may also be used.

A. Antibody Production

A number of immunogens may be used to produce antibodies specifically reactive with F06B09 proteins. Recombinant protein is a preferred immunogen for the production of monoclonal or polyclonal antibodies. Naturally occurring protein may also be used either in pure or impure form. Synthetic peptides, made using the human F06B09 protein sequences described herein, may also be used as an immunogen for the production of antibodies to F06B09 proteins. Recombinant protein can be expressed in eukaryotic or prokaryotic cells as described herein, and purified as described. Naturally folded or denatured material can be used, as appropriate, for producing antibodies. Either monoclonal or polyclonal antibodies may be generated, e.g., for subsequent use in immunoassays to measure the protein.

Methods of producing polyclonal antibodies are well known to those of skill in the art. Typically, an immunogen, preferably a purified protein, is mixed with an adjuvant and animals are immunized with the mixture. The animal's immune response to the immunogen preparation is monitored by taking test bleeds and determining the titer of reactivity to the F06B09 protein of interest. For example, when appropriately high titers of antibody to the immunogen are obtained, usually after repeated immunizations, blood is collected from the animal and antisera are prepared. Further fractionation of the antisera to enrich for antibodies reactive to the protein can be done if desired. See, e.g., Harlow and Lane; or Coligan.

Monoclonal antibodies may be obtained by various techniques familiar tothose skilled in the art. Typically, spleen cells from an animalimmunized with a desired antigen are immortalized, commonly by fusionwith a myeloma cell (see, Kohler and Milstein (1976) Eur. J. Immunol.6:511-519, incorporated herein by reference). Alternative methods ofimmortalization include transformation with Epstein Barr Virus,oncogenes, or retroviruses, or other methods known in the art. Coloniesarising from single immortalized cells are screened for production ofantibodies of the desired specificity and affinity for the antigen, andyield of the monoclonal antibodies produced by such cells may beenhanced by various techniques, including injection into the peritonealcavity of a vertebrate host. Alternatively, one may isolate DNAsequences which encode a monoclonal antibody or a binding fragmentthereof by screening a DNA library from human B cells according, e.g.,to the general protocol outlined by Huse, et al. (1989) Science246:1275-1281.

Antibodies, including binding fragments and single chain versions, against predetermined fragments of F06B09 proteins can be raised by immunization of animals with conjugates of the fragments with carrier proteins as described above. Monoclonal antibodies are prepared from cells secreting the desired antibody. These antibodies can be screened for binding to normal or defective F06B09 protein, or screened for agonistic or antagonistic activity, e.g., mediated through a receptor. These monoclonal antibodies will usually bind with at least a K_D of about 1 mM, more usually at least about 300 μM, typically at least about 100 μM, more typically at least about 30 μM, preferably at least about 10 μM, and more preferably at least about 3 μM or better.

In some instances, it is desirable to prepare monoclonal antibodies from various mammalian hosts, such as mice, rodents, primates, humans, etc. Description of techniques for preparing such monoclonal antibodies may be found in, e.g., Stites, et al. (eds.) Basic and Clinical Immunology (4th ed.) Lange Medical Publications, Los Altos, Calif., and references cited therein; Harlow and Lane (1988) Antibodies: A Laboratory Manual CSH Press; Goding (1986) Monoclonal Antibodies: Principles and Practice (2d ed.) Academic Press, New York, N.Y.; and particularly in Kohler and Milstein (1975) Nature 256:495-497, which discusses one method of generating monoclonal antibodies. Summarized briefly, this method involves injecting an animal with an immunogen. The animal is then sacrificed and cells are taken from its spleen, which are then fused with myeloma cells. The result is a hybrid cell or “hybridoma” that is capable of reproducing in vitro. The population of hybridomas is then screened to isolate individual clones, each of which secretes a single antibody species to the immunogen. In this manner, the individual antibody species obtained are the products of immortalized and cloned single B cells from the immune animal generated in response to a specific site recognized on the immunogenic substance.

Other suitable techniques involve selection of libraries of antibodies in phage or similar vectors. See, e.g., Huse, et al. (1989) “Generation of a Large Combinatorial Library of the Immunoglobulin Repertoire in Phage Lambda,” Science 246:1275-1281; and Ward, et al. (1989) Nature 341:544-546. The polypeptides and antibodies of the present invention may be used with or without modification, including chimeric or humanized antibodies. Frequently, the polypeptides and antibodies will be labeled by joining, either covalently or non-covalently, a substance which provides for a detectable signal. A wide variety of labels and conjugation techniques are known and are reported extensively in both the scientific and patent literature. Suitable labels include radionuclides, enzymes, substrates, cofactors, inhibitors, fluorescent moieties, chemiluminescent moieties, magnetic particles, and the like. Patents teaching the use of such labels include U.S. Pat. Nos. 3,817,837; 3,850,752; 3,939,350; 3,996,345; 4,277,437; 4,275,149; and 4,366,241. Also, recombinant immunoglobulins may be produced, see, Cabilly, U.S. Pat. No. 4,816,567; and Queen, et al. (1989) Proc. Nat'l. Acad. Sci. USA 86:10029-10033; or made in transgenic mice, see Mendez, et al. (1997) Nature Genetics 15:146-156.

The antibodies of this invention are useful for affinity chromatography in isolating F06B09 protein. Columns can be prepared where the antibodies are linked to a solid support, e.g., particles, such as agarose, SEPHADEX, or the like. A cell lysate or supernatant may be passed through the column and the column washed, followed by elution with increasing concentrations of a mild denaturant, whereby purified F06B09 protein will be released. The converse can be performed using protein to isolate specific antibodies.

Other antibodies may block enzymatic activity. The antibodies may also be used to screen expression libraries for particular expression products. Usually the antibodies used in such a procedure will be labeled with a moiety allowing easy detection of presence of antigen by antibody binding.

Antibodies to F06B09 proteins may be used for the identification of cell populations expressing F06B09 protein. By assaying the expression products of cells expressing F06B09 proteins it is possible to diagnose disease, e.g., metabolic conditions. The proteins may also be markers for specific tissue or cell subpopulations, e.g., dendritic cells.

Antibodies raised against each F06B09 protein will also be useful to raise anti-idiotypic antibodies. These will be useful in detecting or diagnosing various immunological conditions related to expression of the respective antigens.

B. Immunoassays

A particular protein can be measured by a variety of immunoassay methods. For a review of immunological and immunoassay procedures in general, see Stites and Terr (eds. 1991) Basic and Clinical Immunology (7th ed.). Moreover, the immunoassays of the present invention can be performed in many configurations, which are reviewed extensively in Maggio (ed. 1980) Enzyme Immunoassay CRC Press, Boca Raton, Fla.; Tijssen (1985) “Practice and Theory of Enzyme Immunoassays,” Laboratory Techniques in Biochemistry and Molecular Biology, Elsevier Science Publishers B.V., Amsterdam; and Harlow and Lane Antibodies, A Laboratory Manual, supra, each of which is incorporated herein by reference. See also Chan (ed. 1987) Immunoassay: A Practical Guide Academic Press, Orlando, Fla.; Price and Newman (eds. 1991) Principles and Practice of Immunoassays Stockton Press, NY; and Ngo (ed. 1988) Non-isotopic Immunoassays Plenum Press, NY.

Immunoassays for measurement of F06B09 proteins or peptides can be performed by a variety of methods known to those skilled in the art. In brief, immunoassays to measure the protein can be either competitive or noncompetitive binding assays. In competitive binding assays, the sample to be analyzed competes with a labeled analyte for specific binding sites on a capture agent bound to a solid surface. Preferably the capture agent is an antibody specifically reactive with F06B09 proteins produced as described above. The concentration of labeled analyte bound to the capture agent is inversely proportional to the amount of free analyte present in the sample.

In a competitive binding immunoassay, the F06B09 protein present in the sample competes with labeled protein for binding to a specific binding agent, for example, an antibody specifically reactive with the F06B09 protein. The binding agent may be bound to a solid surface to effect separation of bound labeled protein from the unbound labeled protein. Alternately, the competitive binding assay may be conducted in liquid phase and a variety of techniques known in the art may be used to separate the bound labeled protein from the unbound labeled protein. Following separation, the amount of bound labeled protein is determined. The amount of protein present in the sample is inversely proportional to the amount of labeled protein binding.

Alternatively, a homogeneous immunoassay may be performed in which a separation step is not needed. In these immunoassays, the label on the protein is altered by the binding of the protein to its specific binding agent. This alteration in the labeled protein results in a decrease or increase in the signal emitted by the label, so that measurement of the label at the end of the immunoassay allows for detection or quantitation of the protein.

F06B09 proteins may also be determined by a variety of noncompetitive immunoassay methods. For example, a two-site, solid phase sandwich immunoassay may be used. In this type of assay, a binding agent for the protein, for example an antibody, is attached to a solid support. A second protein binding agent, which may also be an antibody, and which binds the protein at a different site, is labeled. After binding at both sites on the protein has occurred, the unbound labeled binding agent is removed and the amount of labeled binding agent bound to the solid phase is measured. The amount of labeled binding agent bound is directly proportional to the amount of protein in the sample.

Western blot analysis can be used to determine the presence of F06B09 proteins in a sample. Electrophoresis is carried out, for example, on a tissue sample suspected of containing the protein. Following electrophoresis to separate the proteins, and transfer of the proteins to a suitable solid support, e.g., a nitrocellulose filter, the solid support is incubated with an antibody reactive with the protein. This antibody may be labeled, or alternatively may be detected by subsequent incubation with a second labeled antibody that binds the primary antibody.

The immunoassay formats described above may employ labeled assay components. The label may be coupled directly or indirectly to the desired component of the assay according to methods well known in the art. A wide variety of labels and methods may be used. Traditionally, a radioactive label incorporating ³H, ¹²⁵I, ³⁵S, ¹⁴C, or ³²P was used. Non-radioactive labels include ligands which bind to labeled antibodies, fluorophores, chemiluminescent agents, enzymes, and antibodies which can serve as specific binding pair members for a labeled ligand. The choice of label depends on sensitivity required, ease of conjugation with the compound, stability requirements, and available instrumentation. For a review of various labeling or signal producing systems which may be used, see U.S. Pat. No. 4,391,904, which is incorporated herein by reference.

Antibodies reactive with a particular protein can also be measured by a variety of immunoassay methods. For a review of immunological and immunoassay procedures applicable to the measurement of antibodies by immunoassay techniques, see Stites and Terr (eds.) Basic and Clinical Immunology (7th ed.) supra; Maggio (ed.) Enzyme Immunoassay, supra; and Harlow and Lane Antibodies, A Laboratory Manual, supra.

In brief, immunoassays to measure antisera reactive with F06B09 proteins can be either competitive or noncompetitive binding assays. In competitive binding assays, the sample analyte competes with a labeled analyte for specific binding sites on a capture agent bound to a solid surface. Preferably the capture agent is a purified recombinant F06B09 protein produced as described above. Other sources of F06B09 proteins, including isolated or partially purified naturally occurring protein, may also be used. Noncompetitive assays include sandwich assays, in which the sample analyte is bound between two analyte-specific binding reagents. One of the binding agents is used as a capture agent and is bound to a solid surface. The second binding agent is labeled and is used to measure or detect the resultant complex by visual or instrument means. A number of combinations of capture agent and labeled binding agent can be used. A variety of different immunoassay formats, separation techniques, and labels can also be used, similar to those described above for the measurement of F06B09 proteins. Similar methods may be used to evaluate or quantitate specific binding compounds.

VI. Purified F06B09 Proteins

Human F06B09 protein amino acid sequences are provided in Table 1 and SEQ ID NO: 2 or 4.

Purified protein or defined peptides are useful for generating antibodies by standard methods, as described above. Synthetic peptides or purified protein can be presented to an immune system to generate polyclonal and monoclonal antibodies. See, e.g., Coligan (1991) Current Protocols in Immunology Wiley/Greene, NY; and Harlow and Lane (1989) Antibodies: A Laboratory Manual Cold Spring Harbor Press, NY, which are incorporated herein by reference.

The specific binding composition can be used for screening an expression library made from a cell line which expresses an F06B09 protein. Many methods for screening are available, e.g., standard staining of surface expressed ligand, or by panning. Screening of intracellular expression can also be performed by various staining or immunofluorescence procedures. The binding compositions could be used to affinity purify or sort out cells expressing the ligand.

The peptide segments, along with comparison to homologous genes, can also be used to produce appropriate oligonucleotides to screen a library. The genetic code can be used to select appropriate oligonucleotides useful as probes for screening. In combination with polymerase chain reaction (PCR) techniques, synthetic oligonucleotides will be useful in selecting desired clones from a library, including natural allelic and polymorphic variants.
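One way the genetic code can guide probe selection is to favor peptide segments whose back-translation is least degenerate. The following is a minimal sketch under stated assumptions: the function names, the window length, and the example peptide are hypothetical illustrations and are not taken from the sequences disclosed herein; the codon counts are those of the standard genetic code.

```python
from math import prod

# Number of codons encoding each amino acid in the standard genetic code.
CODON_COUNT = {
    "M": 1, "W": 1,
    "C": 2, "D": 2, "E": 2, "F": 2, "H": 2, "K": 2, "N": 2, "Q": 2, "Y": 2,
    "I": 3,
    "A": 4, "G": 4, "P": 4, "T": 4, "V": 4,
    "L": 6, "R": 6, "S": 6,
}


def degeneracy(peptide):
    """Number of distinct nucleotide sequences that could encode the peptide."""
    return prod(CODON_COUNT[aa] for aa in peptide.upper())


def least_degenerate_window(peptide, window=6):
    """Return (start index, segment, degeneracy) of the least degenerate window.

    Ties are resolved in favor of the earliest window.
    """
    peptide = peptide.upper()
    if window < 1 or window > len(peptide):
        raise ValueError("window must be between 1 and the peptide length")
    starts = range(len(peptide) - window + 1)
    best = min(starts, key=lambda i: degeneracy(peptide[i:i + window]))
    segment = peptide[best:best + window]
    return best, segment, degeneracy(segment)
```

A segment rich in methionine and tryptophan, and poor in leucine, arginine, and serine, yields a smaller pool of candidate oligonucleotides; the selected segment would then be back-translated into a mixed probe or primer by conventional means.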

The peptide sequences allow preparation of peptides to generate antibodies to recognize such segments, and allow preparation of oligonucleotides which encode such sequences. The sequence also allows for synthetic preparation, e.g., see Dawson, et al. (1994) Science 266:776-779. Analysis of the structural features in comparison with the most closely related reported sequences has revealed similarities with other proteins, particularly the class of proteins known as proteases.

VII. Physical Variants

This invention also encompasses proteins or peptides having substantial amino acid sequence similarity with an amino acid sequence of an F06B09 protein. Natural variants include individual, polymorphic, allelic, strain, or species variants. Conservative substitutions in the amino acid sequence will normally preserve most relevant biological activities. In particular, various substitutions can be made, e.g., embodiments with 10-fold substitutions, 7-fold substitutions, 5-fold substitutions, 3-fold substitutions, 2-fold substitutions, etc. Such embodiments will typically retain particular features, e.g., antigenicity, with the natural forms.

Amino acid sequence similarity, or sequence identity, is determined by optimizing residue matches, if necessary, by introducing gaps as required. This changes when considering conservative substitutions as matches. Conservative substitutions typically include substitutions within the following groups: glycine, alanine; valine, isoleucine, leucine; aspartic acid, glutamic acid; asparagine, glutamine; serine, threonine; lysine, arginine; and phenylalanine, tyrosine. Homologous amino acid sequences include natural polymorphic, allelic, and interspecies variations in each respective protein sequence. Typical homologous proteins or peptides will have from 50-100% similarity (if gaps can be introduced), to 75-100% similarity (if conservative substitutions are included) with the amino acid sequence of the F06B09 protein. Similarity measures will be at least about 50%, generally at least about 60%, more generally at least about 65%, usually at least about 70%, more usually at least about 75%, preferably at least about 80%, and more preferably at least about 80%, and in particularly preferred embodiments, at least about 85% or more. See also Needleman, et al. (1970) J. Mol. Biol. 48:443-453; Sankoff, et al. (1983) Time Warps, String Edits, and Macromolecules: The Theory and Practice of Sequence Comparison Chapter One, Addison-Wesley, Reading, Mass.; and software packages from IntelliGenetics, Mountain View, Calif.; and the University of Wisconsin Genetics Computer Group, Madison, Wis.
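The distinction between identity and similarity drawn above can be illustrated with a short sketch. It assumes the two sequences have already been aligned (gaps written as "-") by an alignment method such as those cited, and it assumes, as one possible convention, that percentages are taken over all aligned columns including gapped ones. The function name and that denominator choice are illustrative assumptions; the substitution groups are those listed in the preceding paragraph.

```python
# Conservative substitution groups as listed in the text (one-letter codes).
CONSERVATIVE_GROUPS = [
    set("GA"),   # glycine, alanine
    set("VIL"),  # valine, isoleucine, leucine
    set("DE"),   # aspartic acid, glutamic acid
    set("NQ"),   # asparagine, glutamine
    set("ST"),   # serine, threonine
    set("KR"),   # lysine, arginine
    set("FY"),   # phenylalanine, tyrosine
]


def percent_scores(aligned_a, aligned_b, gap="-"):
    """Return (percent identity, percent similarity) for two pre-aligned sequences.

    Columns that are gaps in both sequences are ignored; a column with a gap in
    only one sequence counts toward the denominator but never as a match.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    identical = similar = compared = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap and y == gap:
            continue
        compared += 1
        if x == gap or y == gap:
            continue
        if x == y:
            identical += 1
            similar += 1
        elif any(x in group and y in group for group in CONSERVATIVE_GROUPS):
            similar += 1
    if compared == 0:
        raise ValueError("no comparable columns")
    return 100.0 * identical / compared, 100.0 * similar / compared
```

Counting conservative substitutions as matches can only raise the score, which is consistent with the higher similarity range stated when such substitutions are included.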

Natural nucleic acids encoding mammalian F06B09 proteins will typically hybridize to the nucleic acid sequence of SEQ ID NO: 1 or 3 under stringent conditions. For example, nucleic acids encoding human F06B09 proteins will normally hybridize to the nucleic acid of SEQ ID NO: 1 or 3 under stringent hybridization conditions. Generally stringent conditions are selected to be about 10° C. lower than the thermal melting point (Tm) for the probe sequence at a defined ionic strength and pH. The Tm is the temperature (under defined ionic strength and pH) at which 50% of the target sequence hybridizes to a perfectly matched probe. Typically, stringent conditions will be those in which the salt concentration is about 0.2 molar at pH 7 and the temperature is at least about 50° C. Other factors may significantly affect the stringency of hybridization, including, among others, base composition and size of the complementary strands, the presence of organic solvents such as formamide, and the extent of base mismatching. A preferred embodiment will include nucleic acids which will bind to disclosed sequences in 50% formamide and 200 mM NaCl at 42° C. See, e.g., Sambrook, et al.
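The relationship between Tm and a stringent hybridization temperature can be sketched numerically. The text does not give a formula for Tm; the sketch below assumes a widely quoted empirical approximation for long DNA:DNA hybrids (of the form attributed to Meinkoth and Wahl), which accounts for the salt, base composition, formamide, and length factors named above. The coefficients, function names, and example inputs are assumptions for illustration only, and published variants of the approximation use slightly different constants.

```python
import math


def estimate_tm_celsius(na_molar, percent_gc, percent_formamide, length_bp):
    """Rough Tm estimate for a long DNA:DNA hybrid (assumed empirical formula).

    Tm = 81.5 + 16.6*log10([Na+]) + 0.41*(%GC) - 0.61*(%formamide) - 500/L
    """
    if na_molar <= 0 or length_bp <= 0:
        raise ValueError("salt concentration and length must be positive")
    return (
        81.5
        + 16.6 * math.log10(na_molar)
        + 0.41 * percent_gc
        - 0.61 * percent_formamide
        - 500.0 / length_bp
    )


def stringent_temperature(tm_celsius, offset=10.0):
    """Hybridization temperature chosen about `offset` degrees C below Tm."""
    return tm_celsius - offset


# Hypothetical probe: 500 bp, 50% GC, in 0.2 M Na+ with 50% formamide.
tm = estimate_tm_celsius(0.2, 50.0, 50.0, 500)
print(round(tm, 1), round(stringent_temperature(tm), 1))
```

Such an estimate is only a starting point; mismatching between probe and target lowers the effective Tm, so conditions are ordinarily confirmed empirically.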

An isolated F06B09 protein DNA can be readily modified by nucleotide substitutions, nucleotide deletions, nucleotide insertions, and short inversions of nucleotide stretches. These modifications result in novel DNA sequences which encode F06B09 protein antigens, their derivatives, or proteins having highly similar physiological, immunogenic, or antigenic activity.

Modified sequences can be used to produce mutant antigens or to enhance expression. Enhanced expression may involve gene amplification, increased transcription, increased translation, and other mechanisms. Such mutant F06B09 protein derivatives include predetermined or site-specific mutations of the respective protein or its fragments. “Mutant F06B09 protein” encompasses a polypeptide otherwise falling within the homology definition of the human F06B09 protein as set forth above, but having an amino acid sequence which differs from that of an F06B09 protein as found in nature, whether by way of deletion, substitution, or insertion. In particular, “site specific mutant F06B09 protein” generally includes proteins having significant similarity with a protein having a sequence of SEQ ID NO: 2 or 4, and sharing various biological activities, e.g., antigenic or immunogenic, with those sequences, and in preferred embodiments containing most or all of the disclosed sequence. This applies also to polymorphic variants from different individuals. Similar concepts apply to different F06B09 proteins, particularly those found in various warm blooded animals, e.g., mammals and birds. As stated before, it is emphasized that descriptions are generally meant to encompass other F06B09 proteins, not limited to the human embodiments specifically discussed.

Although site specific mutation sites are predetermined, mutants need not be site specific. F06B09 protein mutagenesis can be conducted by making amino acid insertions or deletions. Substitutions, deletions, insertions, or combinations may be generated to arrive at a final construct. Insertions include amino- or carboxyl-terminal fusions, e.g., epitope tags. Random mutagenesis can be conducted at a target codon and the expressed mutants can then be screened for the desired activity. Methods for making substitution mutations at predetermined sites in DNA having a known sequence are well known in the art, e.g., by M13 primer mutagenesis or polymerase chain reaction (PCR) techniques. See also, Sambrook, et al. (1989) and Ausubel, et al. (1987 and Supplements). The mutations in the DNA normally should not place coding sequences out of reading frames and preferably will not create complementary regions that could hybridize to produce secondary mRNA structure such as loops or hairpins.

The present invention also provides recombinant proteins, e.g., heterologous fusion proteins using segments from these proteins. A heterologous fusion protein is a fusion of proteins or segments which are not normally fused in the same manner in nature. Thus, the fusion product of an immunoglobulin with an F06B09 protein polypeptide is a continuous protein molecule having sequences fused in a typical peptide linkage, typically made as a single translation product and exhibiting properties derived from each source peptide. A similar concept applies to heterologous nucleic acid sequences. Antibody fusion proteins are also contemplated.

In addition, new constructs may be made from combining similar functional domains from other proteins. For example, protein-binding or other segments may be “swapped” between different new fusion polypeptides or fragments. See, e.g., Cunningham, et al. (1989) Science 243:1330-1336; and O'Dowd, et al. (1988) J. Biol. Chem. 263:15985-15992. Thus, new chimeric polypeptides exhibiting new combinations of specificities will result from the functional linkage of protein-binding specificities and other functional domains.

VIII. Binding Agent:F06B09 Protein Complexes

An F06B09 protein that specifically binds to or that is specifically immunoreactive with an antibody generated against a defined immunogen, such as an immunogen consisting of the amino acid sequence of SEQ ID NO: 2 or 4, is typically determined in an immunoassay. The immunoassay uses a polyclonal antiserum which was raised to a protein of SEQ ID NO: 2 or 4. This antiserum is selected to have low crossreactivity against other proteases and any such crossreactivity is removed by immunoabsorption prior to use in the immunoassay.

In order to produce antisera for use in an immunoassay, the protein of SEQ ID NO: 2 or 4 is isolated as described herein. For example, recombinant protein may be produced in a mammalian cell line. An inbred strain of mice such as Balb/c is immunized with the protein of SEQ ID NO: 2 or 4 using a standard adjuvant, such as Freund's adjuvant, and a standard mouse immunization protocol (see Harlow and Lane, supra). Alternatively, a synthetic peptide, preferably near full length, derived from the sequences disclosed herein and conjugated to a carrier protein can be used as an immunogen. Polyclonal sera are collected and titered against the immunogen protein in an immunoassay, for example, a solid phase immunoassay with the immunogen immobilized on a solid support. Polyclonal antisera with a titer of 10⁴ or greater are selected and tested for their cross reactivity against other proteases, e.g., using a competitive binding immunoassay such as the one described in Harlow and Lane, supra, at pages 570-573. Preferably two related proteins are used in this determination in conjunction with either F06B09 protein.

Immunoassays in the competitive binding format can be used for the crossreactivity determinations. For example, a protein of SEQ ID NO: 4 can be immobilized to a solid support. Proteins added to the assay compete with the binding of the antisera to the immobilized antigen. The ability of the above proteins to compete with the binding of the antisera to the immobilized protein is compared to the protein of SEQ ID NO: 4. The percent crossreactivity for the above proteins is calculated, using standard calculations. Those antisera with less than 10% crossreactivity with each of the proteins listed above are selected and pooled. The cross-reacting antibodies are then removed from the pooled antisera by immunoabsorption with the above-listed proteins.
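The text refers to standard calculations without specifying them. One common convention, assumed here for illustration, expresses percent crossreactivity as the ratio of the amount of reference protein to the amount of competitor protein required for 50% inhibition of binding, multiplied by 100. The function names, data layout, and example values below are hypothetical; only the 10% selection threshold comes from the text.

```python
def percent_crossreactivity(ic50_reference, ic50_competitor):
    """Percent crossreactivity under the assumed IC50-ratio convention.

    A competitor that needs 20x more protein than the reference to reach
    50% inhibition scores 5%.
    """
    if ic50_reference <= 0 or ic50_competitor <= 0:
        raise ValueError("IC50 values must be positive")
    return 100.0 * ic50_reference / ic50_competitor


def select_antisera(panel, threshold=10.0):
    """Return names of antisera whose crossreactivity is below the threshold
    for every tested protein.

    `panel` maps antiserum name -> {protein name: percent crossreactivity}.
    """
    return [
        name
        for name, scores in panel.items()
        if all(value < threshold for value in scores.values())
    ]
```

Antisera passing this filter would then be pooled and immunoabsorbed as described above.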

The immunoabsorbed and pooled antisera are then used in a competitive binding immunoassay as described above to compare a second protein to the immunogen protein (e.g., the protein motif of SEQ ID NO: 4). In order to make this comparison, the two proteins are each assayed at a wide range of concentrations and the amount of each protein required to inhibit 50% of the binding of the antisera to the immobilized protein is determined. If the amount of the second protein required is less than twice the amount of the protein of SEQ ID NO: 4 that is required, then the second protein is said to specifically bind to an antibody generated to the immunogen.
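The comparison just described can be sketched as two steps: estimating the amount of each protein giving 50% inhibition from its dose-response data, then applying the twofold criterion. The interpolation on a logarithmic concentration scale, the function names, and the example data are assumptions for illustration; the twofold rule is taken from the text.

```python
import math


def ic50_by_interpolation(concentrations, percent_bound):
    """Estimate the concentration giving 50% binding.

    Interpolates linearly on log10(concentration) between the two adjacent
    data points that bracket 50% binding (binding falls as competitor rises).
    """
    points = sorted(zip(concentrations, percent_bound))
    for (c_lo, b_lo), (c_hi, b_hi) in zip(points, points[1:]):
        if b_lo >= 50.0 >= b_hi and b_lo != b_hi:
            fraction = (b_lo - 50.0) / (b_lo - b_hi)
            log_ic50 = math.log10(c_lo) + fraction * (
                math.log10(c_hi) - math.log10(c_lo)
            )
            return 10 ** log_ic50
    raise ValueError("data do not cross 50% binding")


def specifically_binds(ic50_second, ic50_immunogen):
    """Apply the criterion from the text: the second protein qualifies if it
    needs less than twice the amount required of the immunogen protein."""
    return ic50_second < 2.0 * ic50_immunogen
```

For instance, with hypothetical IC50 values of 15 and 10 (same units) for the second protein and the immunogen protein, the criterion is met; with 32 and 10 it is not.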

It is understood that F06B09 proteins are families of homologous proteins that comprise two or more genes. For a particular gene product, such as the human F06B09 proteins, the term refers not only to the amino acid sequences disclosed herein, but also to other proteins that are polymorphic, allelic, non-allelic, or species variants or equivalents. It is also understood that the term “human F06B09 protein” includes equivalent proteins, e.g., nonnatural mutations introduced by deliberate mutation using conventional recombinant technology such as single site mutation, or by excising short sections of DNA encoding F06B09 proteins, or by substituting new amino acids, or adding new amino acids. Such minor alterations must substantially maintain the immunoidentity of the original molecule and/or its biological activity. Thus, these alterations include proteins that are specifically immunoreactive with a designated naturally occurring F06B09 protein, for example, the human F06B09 protein shown in SEQ ID NO: 4. The biological properties of the altered proteins can be determined by expressing the protein in an appropriate cell line and measuring, e.g., enzymatic activity under appropriate conditions. Particular protein modifications considered minor would include conservative substitution of amino acids with similar chemical properties, as described above for F06B09 protein families as a whole. By aligning a protein optimally with the protein of SEQ ID NO: 4, and by using the conventional immunoassays described herein to determine immunoidentity, or by using lymphocyte chemotaxis assays, one can determine the protein compositions of the invention.

IX. Functional Variants

The blocking of physiological response to F06B09 protein may result from the inhibition of enzymatic activity of the protein against its substrate, e.g., through competitive inhibition. Thus, in vitro assays of the present invention will often use isolated protein, membranes from cells expressing a recombinant membrane-associated protein, soluble fragments comprising enzymatically active segments of these proteins, or fragments attached to solid phase substrates. These assays will also allow for the diagnostic determination of the effects of either binding segment mutations and modifications, or protein mutations and modifications, e.g., protein analogs. This invention also contemplates the use of competitive drug screening assays, e.g., where neutralizing antibodies to antigen or enzyme fragments compete with a test compound for binding to the protein. In this manner, the antibodies can be used to detect the presence of a polypeptide which shares one or more antigenic binding sites of the protein and can also be used to occupy binding sites on the protein that might otherwise interact with, e.g., substrate.

“Derivatives” of F06B09 proteins include amino acid sequence mutants, glycosylation variants, and covalent or aggregate conjugates with other chemical moieties. Covalent derivatives can be prepared by linkage of functionalities to groups which are found in F06B09 protein amino acid side chains or at the N- or C-termini, e.g., by means which are well known in the art. These derivatives can include, without limitation, aliphatic esters or amides of the carboxyl terminus, or of residues containing carboxyl side chains, O-acyl derivatives of hydroxyl group-containing residues, and N-acyl derivatives of the amino terminal amino acid or amino group-containing residues, e.g., lysine or arginine. Acyl groups are typically selected from the group of alkyl-moieties including, e.g., C3 to C18 normal alkyl, thereby forming alkanoyl aroyl species. Covalent attachment to carrier proteins may be important when immunogenic moieties are haptens.

In particular, glycosylation alterations are included, e.g., made by modifying the glycosylation patterns of a polypeptide during its synthesis and processing, or in further processing steps. Particularly preferred means for accomplishing this are by exposing the polypeptide to glycosylating enzymes derived from cells which normally provide such processing, e.g., mammalian glycosylation enzymes. Deglycosylation enzymes are also contemplated. Also embraced are versions of the same primary amino acid sequence which have other minor modifications, including phosphorylated amino acid residues, e.g., phosphotyrosine, phosphoserine, or phosphothreonine, or other moieties, including ribosyl groups or cross-linking reagents.

A major group of derivatives are covalent conjugates of the F06B09 protein or fragments thereof with other proteins or polypeptides. These derivatives can be synthesized in recombinant culture, such as N- or C-terminal fusions, or by the use of agents known in the art for their usefulness in cross-linking proteins through reactive side groups. Preferred protein derivatization sites with cross-linking agents are at free amino groups, carbohydrate moieties, and cysteine residues.

Fusion polypeptides between human F06B09 proteins and other homologous or heterologous proteins are also provided. Heterologous polypeptides may be fusions between different related proteins, resulting in, e.g., a hybrid protein exhibiting modified substrate or other binding specificity. Likewise, heterologous fusions may be constructed which would exhibit a combination of properties or activities of the derivative proteins. Typical examples are fusions of a reporter polypeptide, e.g., luciferase, with a segment or domain of a protein, e.g., a receptor-binding segment, so that the presence or location of the fused protein may be easily determined. See, e.g., Dull, et al., U.S. Pat. No. 4,859,609. Other gene fusion partners include bacterial β-galactosidase, trpE, Protein A, β-lactamase, alpha amylase, alcohol dehydrogenase, and yeast alpha mating factor. See, e.g., Godowski, et al. (1988) Science 241:812-816.

Such polypeptides may also have amino acid residues which have been chemically modified by phosphorylation, sulfonation, biotinylation, or the addition or removal of other moieties, particularly those which have molecular shapes similar to phosphate groups. In some embodiments, the modifications will be useful labeling reagents, or serve as purification targets, e.g., affinity ligands.

This invention also contemplates the use of derivatives of F06B09 proteins other than variations in amino acid sequence or glycosylation. Such derivatives may involve covalent or aggregative association with chemical moieties. These derivatives generally fall into three classes: (1) salts, (2) side chain and terminal residue covalent modifications, and (3) adsorption complexes, for example with cell membranes. Such covalent or aggregative derivatives are useful as immunogens, as reagents in immunoassays, or in purification methods such as for affinity purification of ligands or other binding ligands. For example, an F06B09 protein can be immobilized by covalent bonding to a solid support such as cyanogen bromide-activated SEPHAROSE, by methods which are well known in the art, or adsorbed onto polyolefin surfaces, with or without glutaraldehyde cross-linking, for use in the assay or purification of anti-F06B09 protein antibodies. The F06B09 proteins can also be labeled with a detectable group, e.g., radioiodinated by the chloramine T procedure, covalently bound to rare earth chelates, or conjugated to another fluorescent moiety for use in diagnostic assays. Purification of F06B09 proteins may be effected by immobilized antibodies or substrate.

Isolated F06B09 protein genes will allow transformation of cells lacking expression of corresponding F06B09 proteins, e.g., either species types or cells which lack corresponding proteins and exhibit negative background activity. Expression of transformed genes will allow isolation of antigenically pure cell lines, with defined or single species variants. This approach will allow for more sensitive detection and discrimination of the physiological effects of F06B09 protein substrate proteins. Subcellular fragments, e.g., cytoplasts or membrane fragments, can be isolated and used.

X. Uses

The present invention provides reagents which will find use in diagnostic applications as described elsewhere herein, e.g., in the general description for metabolic abnormalities, or below in the description of kits for diagnosis.

F06B09 protein nucleotides, e.g., human F06B09 protein DNA or RNA, may be used as a component in a forensic assay. For instance, the nucleotide sequences provided may be labeled using, e.g., ³²P or biotin and used to probe standard restriction fragment polymorphism blots, providing a measurable character to aid in distinguishing between individuals. Such probes may be used in well-known forensic techniques such as genetic fingerprinting. In addition, nucleotide probes made from F06B09 protein sequences may be used in in situ assays to detect chromosomal abnormalities.

Antibodies and other binding agents directed towards F06B09 proteins or nucleic acids may be used to purify the corresponding F06B09 protein molecule. As described in the Examples below, antibody purification of F06B09 protein components is both possible and practicable. Antibodies and other binding agents may also be used in a diagnostic fashion to determine whether F06B09 protein components are present in a tissue sample or cell population using well-known techniques described herein. The ability to attach a binding agent to an F06B09 protein provides a means to diagnose disorders associated with F06B09 protein misregulation. Antibodies and other F06B09 protein binding agents may also be useful as histological or sorting markers. As described in the Examples below, F06B09 protein expression is limited to specific tissue types. By directing a probe, such as an antibody or nucleic acid, to an F06B09 protein, it is possible to use the probe to distinguish tissue and cell types in situ or in vitro.

This invention also provides reagents with significant therapeutic value. The F06B09 protein (naturally occurring or recombinant), fragments thereof, and antibodies thereto, along with compounds identified as having binding affinity to an F06B09 protein, are useful in the treatment of conditions associated with abnormal metabolism, physiology, or development, including abnormal immune responsiveness or non-responsiveness. Abnormal proliferation, regeneration, degeneration, and atrophy may be modulated by appropriate therapeutic treatment using the compositions provided herein. The F06B09 proteins likely play roles in regulation or development of hematopoietic cells, e.g., lymphoid cells, which affect immunological responses.

Thus, for example, an antagonist of an F06B09 protein could be useful in blocking the conversion of an immature or inactive immunologically relevant pro-protein to the mature or active form. Since the F06B09 proteases were derived from dendritic cells (DC), antagonists could also be important in preventing antigen processing and/or subsequent presentation. In addition, effects on DC migration or dendrite extension between cells may result. One potential therapeutic application of F06B09 would be to block this protease in inflammatory processes involving the DC. The blocking could occur on the F06B09 MMP itself or on other molecules interacting with it.

Other abnormal developmental conditions are known in cell types shown to possess F06B09 protein encoding mRNA by northern blot analysis. See Berkow (ed.) The Merck Manual of Diagnosis and Therapy. Merck & Co., Rahway, N.J.; and Thorn, et al. Harrison's Principles of Internal Medicine. McGraw-Hill, N.Y. Developmental or functional abnormalities, e.g., of the immune system, cause significant medical abnormalities and conditions which may be susceptible to prevention or treatment using compositions provided herein.

Recombinant F06B09 protein antibodies can be purified and then administered to a patient. These reagents can be combined for therapeutic use with additional active or inert ingredients, e.g., in conventional pharmaceutically acceptable carriers or diluents, e.g., immunogenic adjuvants, along with physiologically innocuous stabilizers and excipients. These combinations can be sterile filtered and placed into dosage forms as by lyophilization in dosage vials or storage in stabilized aqueous preparations. This invention also contemplates use of antibodies or binding fragments thereof, including forms which are not complement binding.

Drug screening using antibodies or fragments thereof can identify compounds having binding affinity to F06B09 protein, including isolation of associated components. Various substrate candidates can be screened. Subsequent biological assays can then be utilized to determine if the compound has intrinsic enzyme blocking activity. Likewise, a compound having intrinsic stimulating activity might increase the activity of an F06B09 protein. This invention further contemplates the therapeutic use of antibodies to F06B09 protein as antagonists. This approach should be particularly useful with other F06B09 protein polymorphic or species variants.

The quantities of reagents necessary for effective therapy will depend upon many different factors, including means of administration, target site, physiological state of the patient, and other medicaments administered. Thus, treatment dosages should be titrated to optimize safety and efficacy. Typically, dosages used in vitro may provide useful guidance in the amounts useful for in situ administration of these reagents. Animal testing of effective doses for treatment of particular disorders will provide further predictive indication of human dosage. Various considerations are described, e.g., in Gilman, et al. (eds. 1990) Goodman and Gilman's: The Pharmacological Basis of Therapeutics (8th ed.) Pergamon Press; and (1990) Remington's Pharmaceutical Sciences (17th ed.) Mack Publishing Co., Easton, Pa. Methods for administration are discussed therein and below, e.g., for oral, intravenous, intraperitoneal, or intramuscular administration, transdermal diffusion, and others. Pharmaceutically acceptable carriers will include water, saline, buffers, and other compounds described, e.g., in the Merck Index, Merck & Co., Rahway, N.J. Dosage ranges would ordinarily be expected to be in amounts lower than 1 mM concentrations, typically less than about 10 μM concentrations, usually less than about 100 nM, preferably less than about 10 pM (picomolar), and most preferably less than about 1 fM (femtomolar), with an appropriate carrier. Slow release formulations, or a slow release apparatus, will often be utilized for continuous administration.

F06B09 proteins or fragments thereof; antibodies to them or to their fragments; antagonists; and agonists may be administered directly to the host to be treated or, depending on the size of the compounds, it may be desirable to conjugate them to carrier proteins such as ovalbumin or serum albumin prior to their administration. Therapeutic formulations may be administered in any conventional dosage formulation. While it is possible for the active ingredient to be administered alone, it is preferable to present it as a pharmaceutical formulation. Formulations typically comprise at least one active ingredient, as defined above, together with one or more acceptable carriers thereof. Each carrier should be both pharmaceutically and physiologically acceptable in the sense of being compatible with the other ingredients and not injurious to the patient. Formulations include those suitable for oral, rectal, nasal, or parenteral (including subcutaneous, intramuscular, intravenous and intradermal) administration. The formulations may conveniently be presented in unit dosage form and may be prepared by any methods well known in the art of pharmacy. See, e.g., Gilman, et al. (eds. 1990) Goodman and Gilman's: The Pharmacological Basis of Therapeutics (8th ed.) Pergamon Press; and (1990) Remington's Pharmaceutical Sciences (17th ed.) Mack Publishing Co., Easton, Pa.; Avis, et al. (eds. 1993) Pharmaceutical Dosage Forms: Parenteral Medications Dekker, N.Y.; Lieberman, et al. (eds. 1990) Pharmaceutical Dosage Forms: Tablets Dekker, N.Y.; and Lieberman, et al. (eds. 1990) Pharmaceutical Dosage Forms: Disperse Systems Dekker, N.Y. The therapy of this invention may be combined with or used in association with other therapeutic agents.

Both the naturally occurring and the recombinant forms of the F06B09 proteins of this invention are particularly useful in kits and assay methods which are capable of screening compounds for binding activity to the proteins, including substrates or competitive inhibitors. Several methods of automating assays have been developed in recent years so as to permit screening of tens of thousands of compounds in a short period. See, e.g., Fodor, et al. (1991) Science 251:767-773, and other descriptions of chemical diversity libraries, which describe means for testing of binding affinity by a plurality of compounds. The development of suitable assays can be greatly facilitated by the availability of large amounts of purified, soluble F06B09 protein as provided by this invention.

For example, antagonists or inhibitors can normally be found once the protein has been structurally defined. Testing of potential substrates or analogs is now possible upon the development of highly automated assay methods using a purified enzyme. In particular, new agonists and antagonists will be discovered by using screening techniques described herein. Of particular importance are compounds found to have a combined blockage activity for multiple F06B09 protein substrates, e.g., compounds which can serve as antagonists for polymorphic or species variants of an F06B09 protein. Inhibitors can be identified, which may be useful as therapeutic entities.

This invention is particularly useful for screening compounds by using recombinant protein in a variety of drug screening techniques. The advantages of using a recombinant protein in screening for specific inhibitors include: (a) improved renewable source of the F06B09 protein from a specific source; (b) potentially greater number of molecules per cell giving better signal to noise ratio in assays; and (c) species variant specificity (theoretically giving greater biological and disease specificity).

One method of drug screening utilizes eukaryotic or prokaryotic host cells which are stably transformed with recombinant DNA molecules expressing an F06B09 protein substrate. Cells may be isolated which express a substrate in isolation from any others. Such cells, either in viable or fixed form, can be used for standard enzyme/substrate cleavage assays. See also, Parce, et al. (1989) Science 246:243-247; and Owicki, et al. (1990) Proc. Nat'l. Acad. Sci. USA 87:4007-4011, which describe sensitive methods to detect cellular responses. Competitive assays are particularly useful, where the cells (source of F06B09 protein) or homogenates are contacted and incubated with a labeled antibody having known binding affinity to the protein, such as ¹²⁵I-antibody, and a test sample whose binding affinity to the binding composition is being measured. The bound and free labeled binding compositions are then separated to assess the degree of antigen binding. The amount of test compound bound is inversely proportional to the amount of labeled reagent binding to the known source. Any one of numerous techniques can be used to separate bound from free antigen to assess the degree of ligand binding. This separation step could typically involve a procedure such as adhesion to filters followed by washing, adhesion to plastic followed by washing, or centrifugation of the cell membranes. Viable cells could also be used to screen for the effects of drugs on F06B09 protein mediated functions, e.g., proprotein activation, substrate cleavage, and others. Some detection methods allow for elimination of a separation step, e.g., a proximity sensitive detection system.
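As a rough illustration of how the read-out of such a competitive assay might be reduced to a single number, the following Python sketch converts bound labeled-antibody counts into percent displacement by a test sample. The function name, the background-subtraction convention, and all counts are assumptions made for illustration; none are taken from this specification.

```python
def percent_displacement(bound_with_test, bound_no_test, nonspecific=0.0):
    """Percent of specific labeled-reagent binding displaced by a test sample.

    bound_with_test: counts (e.g., cpm) of labeled antibody bound with the test sample
    bound_no_test:   counts bound with no competitor present (maximum binding)
    nonspecific:     counts from a hypothetical non-specific binding control
    """
    specific_max = bound_no_test - nonspecific
    if specific_max <= 0:
        raise ValueError("maximum specific binding must be positive")
    specific_test = bound_with_test - nonspecific
    return 100.0 * (1.0 - specific_test / specific_max)


# Illustrative counts only (not data from the specification).
examples = {
    "no competitor": percent_displacement(10500, 10500, nonspecific=500),
    "weak competitor": percent_displacement(8000, 10500, nonspecific=500),
    "strong competitor": percent_displacement(1500, 10500, nonspecific=500),
}
for name, value in examples.items():
    print(f"{name}: {value:.1f}% displaced")
```

The less labeled reagent that remains bound, the higher the displacement, which mirrors the inverse relationship described above.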

Another method utilizes solubilized, unpurified or solubilized, purified F06B09 protein from transformed eukaryotic or prokaryotic host cells. This allows for a “molecular” binding assay with the advantages of increased specificity, the ability to automate, and high drug test throughput.

Another technique for drug screening, which provides high throughput screening for compounds having suitable binding affinity to an F06B09 protein, e.g., an antibody, is described in detail in Geysen, European Patent Application 84/03564, published on Sep. 13, 1984. First, large numbers of different small peptide test compounds are synthesized on a solid substrate, e.g., plastic pins or some other appropriate surface, see Fodor, et al., supra. Then all the pins are reacted with solubilized, unpurified or solubilized, purified F06B09 protein antibody, and washed. The next step involves detecting bound F06B09 protein antibody.

Rational drug design may also be based upon structural studies of the molecular shapes of the F06B09 protein and other effectors or analogs. See, e.g., Methods in Enzymology vols. 202 and 203. Effectors may be other proteins which mediate other functions in response to antigen binding, or other proteins which normally interact with the substrate. One means for determining which sites interact with specific other proteins is a physical structure determination, e.g., x-ray crystallography or two-dimensional NMR techniques. These will provide guidance as to which amino acid residues form molecular contact regions. For a detailed description of protein structural determination, see, e.g., Blundell and Johnson (1976) Protein Crystallography Academic Press, NY.

A purified F06B09 protein can be coated directly onto plates for use in the aforementioned drug screening techniques. However, non-neutralizing antibodies to these antigens can be used as capture antibodies to immobilize the respective antigen on the solid phase. Screening may be performed, e.g., on hybridomas, to find clones with desired binding specificity, or for inhibitors, e.g., of enzymatic activity.

XI. Kits

This invention also contemplates use of F06B09 proteins, fragments thereof, peptides, and their fusion products in a variety of diagnostic kits and methods for detecting the presence of F06B09 protein or an F06B09 protein substrate. Typically the kit will have a compartment containing either a defined F06B09 peptide or gene segment or a reagent which recognizes one or the other, e.g., substrates or antibodies.

A kit for determining the binding affinity of a test compound to an F06B09 protein would typically comprise a test compound; a labeled compound, e.g., an antibody having known binding affinity for the F06B09 protein; a source of F06B09 protein (naturally occurring or recombinant); and a means for separating bound from free labeled compound, such as a solid phase for immobilizing the F06B09 protein. Once compounds are screened, those having suitable binding affinity to the F06B09 protein can be evaluated in suitable biological assays, as are well known in the art, to determine whether they act as agonists or antagonists to a substrate. The availability of recombinant F06B09 polypeptides also provides well defined standards for calibrating such assays.

A preferred kit for determining the concentration of, for example, an F06B09 protein in a sample would typically comprise a labeled compound, e.g., antibody, having known binding affinity for the F06B09 protein, a source of F06B09 protein (naturally occurring or recombinant), and a means for separating the bound from free labeled compound, for example, a solid phase for immobilizing the F06B09 protein. Compartments containing reagents, and instructions, will normally be provided.

Antibodies, including antigen binding fragments, specific for the F06B09 protein, or fragments thereof, are useful in diagnostic applications to detect the presence of elevated levels of F06B09 protein and/or its fragments. Such diagnostic assays can employ lysates, live cells, fixed cells, immunofluorescence, cell cultures, body fluids, and further can involve the detection of antigens related to the ligand in serum, or the like. Diagnostic assays may be homogeneous (without a separation step between free reagent and F06B09 protein complex) or heterogeneous (with a separation step). Various commercial assays exist, such as radioimmunoassay (RIA), enzyme-linked immunosorbent assay (ELISA), enzyme immunoassay (EIA), enzyme-multiplied immunoassay technique (EMIT), substrate-labeled fluorescent immunoassay (SLFIA), and the like. For example, unlabeled antibodies can be employed by using a second antibody which is labeled and which recognizes the antibody to an F06B09 protein or to a particular fragment thereof. Similar assays have also been extensively discussed in the literature. See, e.g., Harlow and Lane (1988) Antibodies: A Laboratory Manual, CSH Press, NY; Chan (ed. 1987) Immunoassay: A Practical Guide Academic Press, Orlando, Fla.; Price and Newman (eds. 1991) Principles and Practice of Immunoassay Stockton Press, NY; and Ngo (ed. 1988) Nonisotopic Immunoassay Plenum Press, NY.

Anti-idiotypic antibodies may have similar use to diagnose presence of antibodies against an F06B09 protein, as such may be diagnostic of various abnormal states. For example, overproduction of F06B09 protein may result in production of various immunological or other medical reactions which may be diagnostic of abnormal physiological states, e.g., in cell growth, activation, or differentiation.

Frequently, the reagents for diagnostic assays are supplied in kits, so as to optimize the sensitivity of the assay. For the subject invention, depending upon the nature of the assay, the protocol, and the label, either labeled or unlabeled antibody, or labeled F06B09 protein is provided. This is usually in conjunction with other additives, such as buffers, stabilizers, materials necessary for signal production such as substrates for enzymes, and the like. Preferably, the kit will also contain instructions for proper use and disposal of the contents after use. Typically the kit has compartments for each useful reagent. Desirably, the reagents are provided as a dry lyophilized powder, where the reagents may be reconstituted in an aqueous medium providing appropriate concentrations of reagents for performing the assay.

Many of the aforementioned constituents of the drug screening and the diagnostic assays may be used without modification, or may be modified in a variety of ways. For example, labeling may be achieved by covalently or non-covalently joining a moiety which directly or indirectly provides a detectable signal. In any of these assays, the protein, test compound, F06B09 protein, or antibodies thereto can be labeled either directly or indirectly. Possibilities for direct labeling include label groups: radiolabels such as ¹²⁵I, enzymes (U.S. Pat. No. 3,645,090) such as peroxidase and alkaline phosphatase, and fluorescent labels (U.S. Pat. No. 3,940,475) capable of monitoring the change in fluorescence intensity, wavelength shift, or fluorescence polarization. Possibilities for indirect labeling include biotinylation of one constituent followed by binding to avidin coupled to one of the above label groups.

There are also numerous methods of separating the bound from the free antigen, or alternatively the bound from the free test compound. The F06B09 protein can be immobilized on various matrices followed by washing. Suitable matrices include plastic such as an ELISA plate, filters, and beads. Methods of immobilizing the F06B09 protein to a matrix include, without limitation, direct adhesion to plastic, use of a capture antibody, chemical coupling, and biotin-avidin. The last step in this approach usually involves the precipitation of enzyme/antibody or enzyme/substrate complex by various methods including those utilizing, e.g., an organic solvent such as polyethylene glycol or a salt such as ammonium sulfate. Other suitable separation techniques include, without limitation, the fluorescein antibody magnetizable particle method described in Rattle, et al. (1984) Clin. Chem. 30:1457-1461, and the double antibody magnetic particle separation as described in U.S. Pat. No. 4,659,678.

Methods for linking proteins or their fragments to the various labels have been extensively reported in the literature and do not require detailed discussion here. Many of the techniques involve the use of activated carboxyl groups either through the use of carbodiimide or active esters to form peptide bonds, the formation of thioethers by reaction of a mercapto group with an activated halogen such as chloroacetyl, or an activated olefin such as maleimide, for linkage, or the like. Fusion proteins will also find use in these applications.

Another diagnostic aspect of this invention involves use of oligonucleotide or polynucleotide sequences taken from the sequence of an F06B09 protein. These sequences can be used as probes for detecting levels of the F06B09 protein message in samples from natural sources, or patients suspected of having an abnormal condition, e.g., an immune problem. The preparation of both RNA and DNA nucleotide sequences, the labeling of the sequences, and the preferred size of the sequences have received ample description and discussion in the literature. Normally an oligonucleotide probe should have at least about 14 nucleotides, usually at least about 18 nucleotides, and the polynucleotide probes may be up to several kilobases. Various detectable labels may be employed, most commonly radionuclides, particularly ³²P. However, other techniques may also be employed, such as using biotin modified nucleotides for introduction into a polynucleotide. The biotin then serves as the site for binding to avidin or antibodies, which may be labeled with a wide variety of labels, such as radionuclides, fluorophores, enzymes, or the like. Alternatively, antibodies may be employed which can recognize specific duplexes, including DNA duplexes, RNA duplexes, DNA-RNA hybrid duplexes, or DNA-protein duplexes. The antibodies in turn may be labeled and the assay carried out where the duplex is bound to a surface, so that upon the formation of duplex on the surface, the presence of antibody bound to the duplex can be detected. The use of probes to the novel anti-sense RNA may be carried out using many conventional techniques such as nucleic acid hybridization, plus and minus screening, recombinational probing, hybrid released translation (HRT), and hybrid arrested translation (HART). This also includes amplification techniques such as polymerase chain reaction (PCR).

Diagnostic kits which also test for the qualitative or quantitative presence of other markers are also contemplated. Diagnosis or prognosis may depend on the combination of multiple indications used as markers. Thus, kits may test for combinations of markers. See, e.g., Viallet, et al. (1989) Progress in Growth Factor Res. 1:89-97.

XII. Substrate Identification

Once a protease has been isolated, methods exist for identifying a target substrate. For example, a candidate substrate can be contacted with an F06B09 protein in an enzymatic reaction. The resulting cleavage or product can be analyzed, e.g., using SDS-PAGE, HPLC, spectroscopy or other forms of analysis. For example, the molecular weight of a protease cleavage product should be compared against the molecular weights of the uncleaved substrate and the F06B09 protein. The successful candidate substrate will exhibit a shift to a lower molecular weight. Analysis of the substrate should determine what site specificity may exist for the enzyme under the tested conditions. Alternatively, if the protease acts by transforming an inactive substrate to the active form, the resulting activity can be assayed, e.g., by the effect of the activated factor, e.g., proliferation, apoptosis, or activation of a target cell.
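The molecular weight comparison described above can be sketched in a few lines of Python. This is only an illustrative outline under stated assumptions: the function name, the tolerance, and every band size (including the value used for the protease) are hypothetical and are not measurements from this specification.

```python
def cleavage_bands(substrate_kda, bands_after_kda, protease_kda, tolerance_kda=2.0):
    """Return bands that are lighter than the intact substrate and are not
    explained by the protease itself (within a gel-resolution tolerance)."""
    shifted = []
    for band in bands_after_kda:
        matches_protease = abs(band - protease_kda) <= tolerance_kda
        lighter_than_substrate = band < substrate_kda - tolerance_kda
        if lighter_than_substrate and not matches_protease:
            shifted.append(band)
    return shifted


# Hypothetical band sizes (kDa) read from a gel after incubation.
cleaved = cleavage_bands(66.0, [66.0, 62.0, 40.0, 26.0], protease_kda=62.0)
uncleaved = cleavage_bands(66.0, [66.0, 62.0], protease_kda=62.0)

print("candidate shows lower molecular weight products:", cleaved)
print("no shift observed:", uncleaved)
```

A non-empty result flags the candidate for follow-up, e.g., determination of the cleavage site; an empty result means only the intact substrate and the protease were seen.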

Sequence specificity of products may allow search through sequence databases to identify candidate proteins as physiologically natural substrates. Alternatively, the protease may be involved in antigen processing and presentation to appropriate immune cells.
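A database search of this kind might, in its simplest form, scan protein sequences for a cleavage-site pattern once one has been determined. The Python sketch below assumes a purely hypothetical motif and a toy in-memory "database"; neither the motif nor the sequences come from this specification, and a real search would read from an actual sequence collection.

```python
import re


def find_motif_sites(sequences, motif_regex):
    """Return {name: [1-based start positions]} for sequences containing the motif.

    A lookahead is used so that overlapping occurrences are also reported.
    """
    pattern = re.compile("(?=(" + motif_regex + "))")
    hits = {}
    for name, seq in sequences.items():
        positions = [m.start() + 1 for m in pattern.finditer(seq.upper())]
        if positions:
            hits[name] = positions
    return hits


# Toy sequences and a hypothetical cleavage motif, for illustration only.
toy_db = {
    "protA": "MKTLLPGAGLVAAS",
    "protB": "MSSPGAGLPGAGIQ",
    "protC": "MAAAKKKQQQ",
}
hits = find_motif_sites(toy_db, "PGAG[LI]")
print(hits)
```

Sequences reported here would only be candidates; each would still need to be confirmed as a substrate in an enzymatic assay.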

The broad scope of this invention is best understood with reference to the following examples, which are not intended to limit the invention to specific embodiments.

EXAMPLES

I. General Methods

Many of the standard methods below are described or referenced, e.g., in Maniatis, et al. (1982) Molecular Cloning: A Laboratory Manual Cold Spring Harbor Laboratory, Cold Spring Harbor Press, NY; Sambrook, et al. (1989) Molecular Cloning: A Laboratory Manual (2d ed.) Vols. 1-3, CSH Press, NY; Ausubel, et al., Biology Greene Publishing Associates, Brooklyn, N.Y.; or Ausubel, et al. (1987 and Supplements) Current Protocols in Molecular Biology Wiley/Greene, NY; Innis, et al. (eds. 1990) PCR Protocols: A Guide to Methods and Applications Academic Press, NY. Methods for protein purification include such methods as ammonium sulfate precipitation, column chromatography, electrophoresis, centrifugation, crystallization, and others. See, e.g., Ausubel, et al. (1987 and periodic supplements); Deutscher (1990) “Guide to Protein Purification,” Methods in Enzymology vol. 182, and other volumes in this series; and manufacturer's literature on use of protein purification products, e.g., Pharmacia, Piscataway, N.J., or Bio-Rad, Richmond, Calif. Combination with recombinant techniques allows fusion to appropriate segments (epitope tags), e.g., to a FLAG sequence or an equivalent which can be fused, e.g., via a protease-removable sequence. See, e.g., Hochuli (1989) Chemische Industrie 12:69-70; Hochuli (1990) “Purification of Recombinant Proteins with Metal Chelate Adsorbent” in Setlow (ed.) Genetic Engineering, Principles and Methods 12:87-98, Plenum Press, NY; and Crowe, et al. (1992) QIAexpress: The High Level Expression & Protein Purification System QIAGEN, Inc., Chatsworth, Calif.

Standard immunological techniques are described, e.g., in Herzenberg, et al. (eds. 1996) Weir's Handbook of Experimental Immunology vols. 1-4, Blackwell Science; Coligan (1991) Current Protocols in Immunology Wiley/Greene, NY; and Methods in Enzymology volumes 70, 73, 74, 84, 92, 93, 108, 116, 121, 132, 150, 162, and 163. Assays for neural cell biological activities are described, e.g., in Wouterlood (ed. 1995) Neuroscience Protocols module 10, Elsevier; Methods in Neurosciences Academic Press; and Neuromethods Humana Press, Totowa, N.J. Methodology of developmental systems is described, e.g., in Meisami (ed.) Handbook of Human Growth and Developmental Biology CRC Press; and Chrispeels (ed.) Molecular Techniques and Approaches in Developmental Biology Interscience.

FACS analyses are described in Melamed, et al. (1990) Flow Cytometry and Sorting Wiley-Liss, Inc., New York, N.Y.; Shapiro (1988) Practical Flow Cytometry Liss, New York, N.Y.; and Robinson, et al. (1993) Handbook of Flow Cytometry Methods Wiley-Liss, New York, N.Y.

II. Hematopoietic Factors, Cells, and Cell Lines

rhGM-CSF (specific activity: 2×10⁶ U/mg; Schering-Plough Research Institute, Kenilworth, N.J.) was used at a saturating concentration of 100 ng/ml (200 U/ml). rhTNFα (specific activity: 2×10⁷ U/mg; Genzyme, Boston, Mass.) was used at an optimal concentration of 2.5 ng/ml (50 U/ml). rhSCF (specific activity: 4×10⁵ U/mg; R&D, Abingdon, U.K.) and rhM-CSF (specific activity: 2×10⁶ U/mg; R&D) were used at an optimal concentration of 25 ng/ml. rhG-CSF (ED₅₀: 0.01-0.03 ng/ml; R&D) was used at an optimal concentration of 25 ng/ml.
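The activity figures quoted in parentheses follow from the mass concentration and the specific activity. A minimal Python sketch of that conversion is given below; the function and variable names are illustrative assumptions, while the input values are those stated above for rhGM-CSF and rhTNFα.

```python
NG_PER_MG = 1_000_000  # 1 mg = 10^6 ng


def units_per_ml(conc_ng_per_ml, specific_activity_u_per_mg):
    """Convert a mass concentration (ng/ml) into activity (U/ml)."""
    return conc_ng_per_ml * specific_activity_u_per_mg / NG_PER_MG


gm_csf_u_per_ml = units_per_ml(100, 2e6)   # rhGM-CSF at 100 ng/ml, 2x10^6 U/mg
tnf_u_per_ml = units_per_ml(2.5, 2e7)      # rhTNF-alpha at 2.5 ng/ml, 2x10^7 U/mg

print(f"rhGM-CSF: {gm_csf_u_per_ml:g} U/ml")
print(f"rhTNF-alpha: {tnf_u_per_ml:g} U/ml")
```

Both results agree with the 200 U/ml and 50 U/ml values given in the text; the same function can be applied to the other factors for which a specific activity is listed.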

Peripheral blood mononuclear cells (PBMC) were obtained from healthy donors after Ficoll-Hypaque gradient centrifugation (d=1.077; Eurobio, Paris, France). T cells were purified from PBMC by immunomagnetic depletion (Dynal, Oslo, Norway) using a cocktail of mAbs (CD14, CD16, CD35, HLA-DR (Immunotech, Marseille, France), CD19 (ascites), NKH1 (Coulter, Hialeah, Fla.), CD40 (mAb 89 produced in the laboratory)). The purity of CD3⁺ T cells was higher than 95%. T cells were activated with coated anti-CD3 and soluble anti-CD28 mAbs for 3, 12 and 24 h. B cells were obtained from human tonsils as described (Liu, et al. (1996) Immunity 4:603-613). Briefly, T cells were first depleted by rosetting with sheep red blood cells and then the residual non-B cells were removed by immunomagnetic depletion using a cocktail of mAbs (CD2, CD3, CD4, CD14, CD16, NKH1, CD35). The purity of CD19⁺ B cells was higher than 98%. Langerhans cells were prepared from normal skin by CD1a positive selection as described (Le Varlet, et al. (1992) J. Leukoc. Biol. 51:415-420). Granulocytes were generated in vitro from CD34⁺ progenitors in the presence of G-CSF and SCF for 12 days. Macrophages were generated in vitro by culturing human cord blood CD34⁺ progenitors with M-CSF and SCF for 12 days (Szabolcs, et al. (1996) Blood 87:4520-4530). Cells were unactivated or activated by PMA-ionomycin for 1 and 6 h (PMA: 1 ng/ml, Sigma, St. Louis, Mo.; Ionomycin: 1 μg/ml, Calbiochem, La Jolla, Calif.) and pooled. The TF1 (erythrocytic), Jurkat (T cell), MRC5 (fibroblastic), JY (lymphoblastoid B cell), and U937 (myelomonocytic) cell lines were obtained from American Type Culture Collection (ATCC, Rockville, Md.). CHA is an epithelial kidney carcinoma cell line kindly provided by C. Bain (Centre Léon Bérard, Lyon, France). All cell lines were stimulated by PMA-ionomycin for 1 h and 6 h and pooled. Murine fibroblasts transfected with human CD40 ligand (CD40L L cells) were produced in the laboratory (Garrone, et al. (1995) J. Exp. Med. 182:1265-1273). All cell types were cultured in RPMI 1640 (GIBCO BRL, Gaithersburg, Md.) supplemented with 10% (vol/vol) heat-inactivated fetal bovine serum (FBS; Flow Laboratories, Irvine, UK), 10 mM Hepes, 2 mM L-glutamine, 5×10⁻⁵ M 2-mercaptoethanol, penicillin (100 U/ml) and streptomycin (100 μg/ml) (hereafter referred to as complete medium).

Generation of DC from CD34⁺ cells and from monocytes.

Umbilical cord blood samples were obtained according to appropriate institutional guidelines. Isolation of CD34⁺ progenitors was achieved using Minimacs separation columns (Miltenyi Biotec GmbH) as described by Caux, et al. (1996) J. Exp. Med. 184:695-706. In all experiments, the isolated cells were 80 to 99% CD34⁺ as judged by staining with anti-CD34 mAb. Cultures of CD34⁺ cells were established in the presence of SCF, GM-CSF, and TNFα as described by Caux, et al. (1992) Nature 360:258-261; or Caux, et al. (1996) J. Exp. Med. 184:695-706. Cells collected after 6 days of culture were separated according to CD1a and CD14 expression into CD14⁺CD1a⁻ and CD14⁻CD1a⁺ populations using a FACStar⁺ (Becton Dickinson, Mountain View, Calif.) as described (Caux, et al. (1996) J. Exp. Med. 184:695-706). Cells were further cultured in the presence of GM-CSF and TNFα until day 12-17, when 70-90% of cells are CD1a⁺ DC. Monocytes were purified by immunomagnetic depletion (Dynal) after preparation of PBMC followed by a 52% Percoll gradient. The depletion was performed with anti-CD3, anti-CD19, and anti-CD8 ascites, and with purified anti-NKH1 (Coulter) and anti-CD16 (Immunotech) mAbs. Monocyte-derived dendritic cells were produced by culturing purified monocytes for 6 days in the presence of GM-CSF and IL-4 (Sallusto and Lanzavecchia (1994) J. Exp. Med. 179:1109-1118). Cells were activated with LPS at the concentration of 25 ng/ml for 1 h to 72 h or with CD40L transfected L cells (one L cell for five DC) (Caux, et al. (1994) J. Exp. Med. 180:1263-1272).

III. cDNA Libraries and Isolation of F06B09 cDNA Clone

Total RNA was isolated from PMA-ionomycin activated CD1a⁻CD14⁺ DC (at day 12 of the culture) and from the activated CHA cell line. See Chomczynski and Sacchi (1987) Anal. Biochem. 162:156-159. RNA was treated with DNase I before mRNA purification using the Oligotex-dT kit (Qiagen GmbH, Hilden, Germany). PolyA+ RNA (2 μg) was used to make a cDNA library in the pSport vector (Superscript Plasmid System Kit, GIBCO BRL). A subtraction library was made using the method of Hara, et al. (1994) Blood 84:189-199, with minor modifications. In this protocol, tracer (subtracted) cDNA was the CD14⁺-derived DC cDNA, and driver (subtractive) cDNA was CHA cDNA. A 0.6 Kb cDNA containing a polyA tail was isolated from the CD14⁺-derived DC subtraction library. The cDNA of this gene was first amplified using the RACE MARATHON™ kit (Clontech, Palo Alto, Calif.) and two oligonucleotides: 5′CAGAAATGCCACGAAACAGCCAGGTACT (NGSP1; SEQ ID NO: 5) and 5′GCCCCAGTTGCTCATACAAACAGATCAG (GSP1; SEQ ID NO: 6) with the recommended cycling program 1. PCR products were cloned in the pCRII plasmid (Invitrogen, San Diego, Calif.). A lambda CD34⁺-derived DC cDNA library was constructed using the GREAT LENGTHS™ cDNA Synthesis kit with the λTriplEx vector (Clontech), and was next screened with a 5′ F06B09 probe to obtain a full-length cDNA. Sequencing was performed on both strands by the dideoxynucleotide method using a Taq Dye Deoxy Terminator Cycle Sequencing kit (Applied Biosystems, Foster City, Calif.) and an automated sequencer (Applied Biosystems).
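Basic properties of the two RACE oligonucleotides given above can be checked with a short script. The Python sketch below is illustrative only: the helper names are assumptions, and only the two primer sequences themselves are taken from the text.

```python
# Primer sequences as given in the text (5' to 3').
primers = {
    "NGSP1": "CAGAAATGCCACGAAACAGCCAGGTACT",
    "GSP1": "GCCCCAGTTGCTCATACAAACAGATCAG",
}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def gc_fraction(seq):
    """Fraction of G and C bases in a DNA sequence."""
    seq = seq.upper()
    return sum(base in "GC" for base in seq) / len(seq)


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


for name, seq in primers.items():
    print(name, "length:", len(seq), "GC fraction:", round(gc_fraction(seq), 2))
    print("  reverse complement:", reverse_complement(seq))
```

The reverse complement is printed because it gives the sequence of the opposite cDNA strand at each primer site, which can help when locating a primer in a sequence file.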

A clone encoding the human F06B09 protein is isolated from a natural human dendritic cell or other source, by one of many different possible methods. Given the sequences provided herein, PCR primers or hybridization probes are selected and/or constructed to isolate a nucleic acid, e.g., genomic DNA segments or cDNA reverse transcripts. Appropriate cell sources include human tissues, e.g., dendritic cell libraries. Tissue distribution below also suggests source tissues. Genetic and polymorphic or allelic variants are isolated by screening a population of individuals.

This clone was discovered via EST analysis of a human dendritic cell subtraction cDNA library. The tracer was CD34+ derived, CD14+, PMA and ionomycin activated dendritic cell cDNA, while the driver (subtractor) was cDNA from the PMA and ionomycin activated kidney carcinoma cell line CHA. The initial poly-A containing EST was selected for its restricted distribution after Northern blot and semi-quantitative PCR analysis. A novel 0.6 Kb partial cDNA (F06B09) was isolated by screening a PMA-ionomycin activated CD34⁺-derived DC subtraction library. Northern blots probed with the F06B09 clone showed a 3.7 Kb mRNA transcript predominantly expressed in dendritic cells. This mRNA was absent by RT-PCR in CHA, the driver epithelial cell line used for subtraction. The 5′ end of the original sequence was extended by RACE and by the screening of a lambda DC cDNA library, to a final 3691 bp cDNA. This cDNA contains a methionine codon located in a consensus Kozak sequence. See Kozak (1986) Cell 44:283-292. The full-length cDNA (see Table 1) shows a 342 bp 5′ untranslated sequence (nt 1-342), a 1689 bp open reading frame (nt 343-2031), a 3′ untranslated sequence of 1660 bp (nt 2032-3691) and a polyadenylation signal AATAAA at position 3638-3643 followed by a poly(A) tail. The encoded protein of 562 amino acids shows strong homology with membrane-type matrix metalloproteinases (MT-MMP). As a member of the metalloproteinase family, F06B09 contains a propeptide domain with a cysteine-switch activation domain at Cys69 (Van Wart and Birkedal-Hansen (1990) Proc. Natl. Acad. Sci. USA 87:5578-5582), and the core enzyme domain contains three zinc-chelating histidine (H) residues at positions 212, 216 and 222 in the zinc binding motif. Like other MT-MMP members, F06B09 presents a consensus insertion RRRR between residues 82-86, corresponding to a furin cleavage site (Table 1). The sequence is followed by a hinge region (260-290) and a potential transmembrane domain of 12 amino acids (526-577) in a hemopexin-like domain (291-538) and a short intracytoplasmic domain (538-541). Multiple alignment with members of the membrane-type matrix metalloproteinase (MT-MMP) family revealed the closest homology with MT4-MMP (48%) (Puente, et al. (1996) Cancer Res. 56:944-949), and 38%, 39% and 35%, respectively, with MT1-MMP (Sato, et al. (1994) Nature 370:61-65), MT2-MMP (Will and Hinzmann (1995) Eur. J. Biochem. 231:602-608) and MT3-MMP (Takino, et al. (1995) J. Biol. Chem. 270:23013-23020). Comparison of the most conserved domain, the catalytic domain, showed that F06B09 presents the highest homology to MT4-MMP (48%) (Puente, et al. (1996) Cancer Res. 56:944-949) and significant homologies to other members of the matrix metalloproteinase (MMP) family, such as the type IV collagenases MMP-9 and MMP-2 (Collier, et al. (1988) J. Biol. Chem. 263:6579-6587; Wilhelm, et al. (1989) J. Biol. Chem. 264:17213-17221).
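
The zinc-chelating histidines noted above can be located with a simple pattern search. The following is a minimal sketch, assuming the commonly cited metzincin consensus H-E-x-x-H-x-x-G-x-x-H and using only residues 208-223 of the mature protein as transcribed into one-letter code from the sequence listing (SEQ ID NO: 4). The function name, the fragment boundaries, and the numbering offset are illustrative choices and are not part of the original disclosure.

```python
import re

# Residues 208-223 of the mature polypeptide, transcribed to one-letter code
# from the sequence listing (SEQ ID NO: 4). Illustrative fragment only.
FRAGMENT = "AVAVHEFGHALGLGHS"
FRAGMENT_START = 208  # mature-protein number of the first residue in FRAGMENT

# Assumed consensus for the zinc-binding motif: H-E-x-x-H-x-x-G-x-x-H
ZINC_MOTIF = re.compile(r"HE..H..G..H")


def zinc_histidines(fragment, first_residue_number):
    """Return mature-protein positions of the three motif histidines, or []."""
    match = ZINC_MOTIF.search(fragment)
    if match is None:
        return []
    motif_start = first_residue_number + match.start()
    # The histidines sit at offsets 0, 4 and 10 within the 11-residue motif.
    return [motif_start + offset for offset in (0, 4, 10)]


if __name__ == "__main__":
    print(zinc_histidines(FRAGMENT, FRAGMENT_START))
```

For the fragment shown, the search reports positions 212, 216 and 222, in agreement with the residues named in the text.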

Therefore, it is suggested that F06B09 is a fifth member of the MT-MMP subgroup of the MMP family.

The coding sequence appears to be complete, encoding a 21 amino acid putative signal peptide followed by a 541 residue polypeptide with significant homology to the membrane-type matrix metalloproteases MT-MMP1 to 4. No evidence yet suggests alternative splicing of this message. The limited EST distribution is indicative of a restricted expression pattern.
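
The coordinates reported for the full-length cDNA and the encoded protein can be cross-checked with simple arithmetic. The sketch below is illustrative only: the dictionary keys and function names are invented for this example, and it assumes that the quoted open reading frame span (nt 343-2031) includes the stop codon, which is consistent with the CDS range 343..2028 given in the sequence listing.

```python
# Feature coordinates (1-based, inclusive) as quoted in the text.
CDNA_LENGTH = 3691
FEATURES = {
    "5' UTR": (1, 342),
    "ORF": (343, 2031),      # assumed to include the stop codon
    "3' UTR": (2032, 3691),
}
SIGNAL_PEPTIDE_AA = 21
MATURE_AA = 541


def span_length(start, end):
    """Length of a 1-based inclusive interval."""
    return end - start + 1


def check_annotation():
    """Summarize whether the quoted coordinates are mutually consistent."""
    lengths = {name: span_length(*coords) for name, coords in FEATURES.items()}
    codons = lengths["ORF"] // 3
    encoded_aa = codons - 1  # the final codon is the stop
    return {
        "lengths": lengths,
        "covers_cdna": sum(lengths.values()) == CDNA_LENGTH,
        "orf_in_frame": lengths["ORF"] % 3 == 0,
        "encoded_aa": encoded_aa,
        "aa_consistent": encoded_aa == SIGNAL_PEPTIDE_AA + MATURE_AA,
    }


if __name__ == "__main__":
    for key, value in check_annotation().items():
        print(key, value)
```

Under these assumptions the three segments (342, 1689 and 1660 bp) account for the whole 3691 bp cDNA, and the open reading frame encodes 562 residues, matching the 21 residue signal peptide plus the 541 residue polypeptide.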

Further clones will be isolated, e.g., using an antibody based selection procedure. Standard expression cloning methods are applied including, e.g., FACS staining of membrane associated expression product. The antibodies are used to identify clones producing a recognized protein. Alternatively, antibodies are used to purify an F06B09 protein, with protein sequencing and standard means to isolate a gene encoding that protein.

Genomic or cDNA sequence based methods will also allow for identification of sequences naturally available, or otherwise, which exhibit homology to the provided sequences.

IV. Isolation of Mouse F06B09

Similar methods are used as above to isolate an appropriate F06B09 protein gene. See, e.g., GenBank Accession numbers X91785, X83537, D63579, and U54984. Similar source materials as indicated above are used to isolate natural genes, including genetic, polymorphic, allelic, or strain variants. Species variants are also isolated using similar methods. Various sequence databases may suggest related or counterpart sequences. See, e.g., Capone, et al. (1996) J. Immunol. 157:969-973.

V. Isolation of an Avian F06B09 Protein Clone

An appropriate avian source is selected as above. Similar methods are utilized to isolate other species variants, though the level of similarity will typically be lower for avian F06B09 protein as compared to a human to mouse sequence.

VI. Message Distribution

PCR based detection is performed by standard methods, preferably using appropriate primers from opposite ends of the coding sequence, but flanking segments might be selected for specific purposes.

Alternatively, hybridization probes are selected. Particular AT or GC contents of probes are selected depending upon the expected homology and mismatching. Appropriate stringency conditions are selected to balance an appropriate positive signal to background ratio. Successive washing steps are used to identify clones of greater homology.
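
As a small illustration of evaluating the GC content of a candidate probe or primer, the sketch below computes the G+C fraction of an oligonucleotide. The helper name is hypothetical; the two example inputs are the NGSP1 and GSP1 oligonucleotides (SEQ ID NO: 5 and 6) given earlier in this description. The function makes no statement about melting temperature or about what GC content is appropriate for a given stringency.

```python
def gc_fraction(sequence):
    """Return the fraction of G and C bases in a DNA sequence (0.0 to 1.0)."""
    seq = sequence.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    gc_count = seq.count("G") + seq.count("C")
    return gc_count / len(seq)


# Oligonucleotides quoted in the text (SEQ ID NO: 5 and SEQ ID NO: 6).
NGSP1 = "CAGAAATGCCACGAAACAGCCAGGTACT"
GSP1 = "GCCCCAGTTGCTCATACAAACAGATCAG"

if __name__ == "__main__":
    for name, oligo in (("NGSP1", NGSP1), ("GSP1", GSP1)):
        print(name, len(oligo), f"{gc_fraction(oligo):.2f}")
```

Both 28-mers contain 14 G or C bases, i.e., a GC fraction of 0.50.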

Total RNA (20 μg), extracted from cell lines or cell preparations as described above, was fractionated by electrophoresis on a 1% agarose-formaldehyde gel and transferred onto positively charged nylon membrane (GeneScreenPlus, NEN Life Science Products, Boston, Mass.) as described by Thomas (1980) Proc. Natl. Acad. Sci. USA 77:5201-5205. After transfer, blots were cross-linked by UV light (Stratalinker, La Jolla, Calif.). The original cloned 600 bp fragment was labeled by random priming with ³²P-dCTP (3000 Ci/mmol, Amersham; Ready to Go, Pharmacia Biotech, Orsay, France) and unincorporated nucleotides were removed by spin column chromatography (Chromaspin-100, Clontech). Hybridization and washes were performed under stringent conditions (0.1×SSC/0.1% SDS at 65° C.). X-ray films (Kodak, Rochester, N.Y.) were exposed for 3 weeks at −80° C. with intensifying screens. Multiple tissue Northern blots of normal fetal and adult organs (purchased from Clontech) were similarly used according to the manufacturer's recommendations.

For RT-PCR methods, total RNA extracted from 1 to 10×10⁶ cells (Chomczynski and Sacchi (1987) Anal. Biochem. 162:156-159) was reverse transcribed using random hexamer primers (Pharmacia, Uppsala, Sweden) and the Superscript RNase-H reverse transcriptase (GIBCO BRL). PCR was performed in a 100 μl volume using 5 ng cDNA, 10 μl 10×PCR reaction buffer (Perkin Elmer Cetus, Norwalk, Conn.), 2.5 U of Taq polymerase (Gene Amp PCR reagents kit: Perkin Elmer Cetus), 200 mM dNTPs, and 500 nM of the 5′ and 3′ amplification primers. The PCR reactions were run in a DNA thermal cycler (Perkin Elmer) for 35 cycles (1 min denaturation at 94° C., 1 min annealing at 60° C., and 2 min elongation at 72° C.). β-actin RT-PCR was used as a positive control for the efficiency of the reaction using sense and antisense primers. Appropriate sense and antisense primers were used to amplify F06B09. See Table 1.
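
The reaction set-up described above implies a few quantities that are easy to derive. The following sketch, with hypothetical helper names, computes the amount of each primer in the reaction, the final buffer strength, and the programmed hold time of the cycling protocol. It assumes that ramp times and any initial denaturation or final extension steps, which are not stated in the text, are excluded.

```python
def primer_pmol(concentration_nM, volume_ul):
    """Amount of primer in pmol (nM x µl gives fmol; 1000 fmol = 1 pmol)."""
    return concentration_nM * volume_ul / 1000.0


def final_buffer_strength(stock_x, stock_volume_ul, total_volume_ul):
    """Final buffer concentration, in multiples of the 1x working strength."""
    return stock_x * stock_volume_ul / total_volume_ul


def cycling_minutes(cycles, step_minutes):
    """Total programmed hold time, ignoring ramping between temperatures."""
    return cycles * sum(step_minutes)


if __name__ == "__main__":
    # Values taken from the protocol: 500 nM primer in 100 µl,
    # 10 µl of 10x buffer, 35 cycles of 1 + 1 + 2 min.
    print(primer_pmol(500, 100))               # pmol of each primer
    print(final_buffer_strength(10, 10, 100))  # buffer strength (x)
    print(cycling_minutes(35, (1, 1, 2)))      # minutes of hold time
```

With the stated values this corresponds to 50 pmol of each primer, a 1× final buffer, and 140 minutes of programmed hold time.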

Northern blot showed a single band of about 4.5 kb in non-activated and PMA and ionomycin-activated, CD34+ derived human DC, and a weak signal in in vitro generated granulocytes. No signal was detected in TF1, Jurkat, CHA, or JY cell lines, nor in freshly isolated monocytes, activated T cells, resting and activated PBLs, or B cells. No expression was found in either fetal or adult tissues. PCR distribution analysis showed expression in activated DC and the MRC5 lung fibroblast cell line, as well as very low signal in U937. The original EST was extended by 5′ RACE. A lambda DC cDNA library was screened with a 5′ probe. An ORF was identified, which showed highest homology with the human MT4-MMP, a recent addition to the membrane type matrix metalloprotease family. Positive signals were also detected in granulocytes.

Northern blot analysis showed a single ˜4 Kb transcript predominantly expressed in resting CD34⁺-derived DC, to a lesser extent in PMA-ionomycin activated DC, and weakly in granulocytes generated in vitro. The expression pattern of this novel gene was also analyzed by RT-PCR, on freshly isolated cells and on various cell lines. Similarly, RT-PCR analysis confirmed the higher level of F06B09 expression in resting and activated CD34⁺-derived DC, in granulocytes and to a weaker extent in resting PBLs. F06B09 mRNA is also weakly present in the B cell line JY. No messenger was detected in TF1 (myelo-erythrocytic), CHA (carcinoma), Jurkat (T cell), MRC5 (fibroblastic) or U937 (myelo-monocytic) cell lines, nor in freshly isolated monocytes, activated T and B cells or activated PBLs. Among normal human tissues tested, a significant band of ˜4 Kb was seen in spleen, lymph node, thymus, appendix, PBL and bone marrow, but was absent in fetal tissues. An additional 6 Kb band, probably corresponding to nonspecific expression or to a longer existing form, was also detected in spleen, PBL and bone marrow but was absent in lymph node, thymus, and appendix.

The cellular distribution of F06B09 was next determined by the extent of hybridization among the gel-fractionated population of cDNA inserts from libraries made from different cell types. Consistent with the above observations, F06B09 is present in both CD34⁺-derived DC and monocyte-derived DC, but also in effector T cells, including Th1 and Th2 cells, and to a weaker extent in NK cells. F06B09 mRNA expression is down-regulated after PMA-ionomycin activation in both DC and T cells. No signal was detected in monocytes, B cell lines, or different fetal tissues.

In conclusion, the novel MT-MMP appears to be mainly transcribed by resting DC and weakly by effector T cells.

F06B09 mRNA is strongly expressed in different types of DC and down-regulated by CD40L activation. Since the original F06B09 clone was identified in a DC library, further characterization was performed by semi-quantitative RT-PCR. The expression of this gene was analyzed during DC differentiation and maturation, in DC generated in vitro either from CD34⁺ cord blood progenitors cultured with GM-CSF and TNFα or from monocytes cultured with GM-CSF and IL-4. During the culture of CD34⁺ human cord blood progenitors, F06B09 is first detected at day 6 and increases up to day 12. This messenger is down-regulated after triggering final maturation of the DC by 4 days of co-culture with CD40L-transfected L cells. Similarly, while monocytes do not express a detectable amount of F06B09 mRNA, significant expression could be detected after 6 days of culture in the presence of GM-CSF and IL-4. In contrast, following activation of these monocyte-derived DC through CD40, the level of mRNA decreases rapidly within 3 h to 12 h. A low amount of F06B09 mRNA is also found in 1 h PMA-ionomycin activated CD1a⁺ and CD14⁺ DC subsets, which is down-regulated after 6 h of PMA-ionomycin activation. Day 12 macrophages generated in vitro also weakly express F06B09, and 6 h of PMA-ionomycin activation of these cells is enough to switch off the signal. In contrast, no signal is detected in freshly isolated Langerhans cells, in basal keratinocytes, in freshly isolated and CD40L activated B cells, or in anti-CD3 and anti-CD28 activated T cells.

Taken together, these results confirm that F06B09 mRNA is expressed in different DC subtypes and rapidly down-regulated upon DC maturation.

VII. Chromosomal Localization.

Comparison of the full-length cDNA sequence of F06B09 with the EMBL nucleotide and EST databases identifies a 436 bp EST (W72721), corresponding exactly to the F06B09 sequence. This EST is located on chromosome 16p13.3. In contrast, MT1-MMP and MT3-MMP have been previously located on chromosome 14q11-12 and 8q21.3-22.1, respectively (Mattei, et al. (1997) Genomics 40:168-169; Mignon, et al. (1995) Genomics 28:360-361). Of note, the novel F06B09 MT-MMP gene is on the same chromosome as MT2-MMP, but the two genes are at different loci; MT2-MMP is on chromosome 16q12 (Mattei, et al. (1997) Genomics 40:168-169; Yasumitsu, et al. (1997) DNA Res. 4:77-79) whereas the novel MT-MMP is on chromosome 16p13.3.

VIII. Expression; Purification; Characterization

With an appropriate clone from above, the coding sequence is inserted into an appropriate expression vector. This may be in a vector specifically selected for a prokaryote, yeast, insect, or higher vertebrate, e.g., mammalian expression system. Standard methods are applied to produce the gene product, preferably as a soluble secreted molecule, but will, in certain instances, also be made as an intracellular protein. Intracellular proteins typically require cell lysis to recover the protein, and insoluble inclusion bodies are a common starting material for further purification.

With a clone encoding a vertebrate F06B09 protein, recombinant production means are used, although natural forms may be purified from appropriate sources, e.g., expressing cell lines. The protein product is purified by standard methods of protein purification, in certain cases, e.g., coupled with immunoaffinity methods. Immunoaffinity methods are used either as a purification step, as described above, or as a detection assay to determine the partition properties of the protein.

Preferably, the protein is secreted into the medium, and the soluble product is purified from the medium in a soluble form. Standard purification techniques applied to soluble proteins are then applied, with enzyme assays or immunodetection methods useful for following where the protease purifies. Alternatively, as described above, inclusion bodies from prokaryotic expression systems are a useful source of material. Typically, the insoluble protein is solubilized from the inclusion bodies and refolded using standard methods. Purification methods are developed as described above.

In certain embodiments, the protein is made in a eukaryotic cell which glycosylates the protein normally. The purification methods may be affected thereby, as may biological activities. The intact protein can be processed to release the protein domain, probably due to a cleavage event. While recombinant protein appears to be processed, the physiological processes which normally do so in native cells remain to be determined.

The product of the purification method described above is characterized to determine many structural features. Standard physical methods are applied, e.g., amino acid analysis and protein sequencing. The resulting protein is subjected to CD spectroscopy and other spectroscopic methods, e.g., NMR, ESR, mass spectroscopy, etc. The product is characterized to determine its molecular form and size, e.g., using gel chromatography and similar techniques. Understanding of the chromatographic properties will lead to more gentle or efficient purification methods.

Prediction of glycosylation sites may be made, e.g., as reported in Hansen, et al. (1995) Biochem. J. 308:801-813.

IX. Preparation of Antibodies Against Vertebrate F06B09 Protein

With protein produced and purified, as above, animals are immunized to produce antibodies. Polyclonal antiserum may be raised using non-purified antigen, though the resulting serum will exhibit higher background levels. Preferably, the antigen is purified using standard protein purification techniques, including, e.g., affinity chromatography using polyclonal serum indicated above. Presence of specific antibodies is detected using defined synthetic peptide fragments.

Alternatively, polyclonal serum is raised against a purified antigen, purified as indicated above, or using synthetic peptides. A series of overlapping synthetic peptides which encompass all of the full length sequence, if presented to an animal, will produce serum recognizing most linear epitopes on the protein. Such an antiserum is used to affinity purify protein. This purified protein, in turn, may be used to immunize another animal to produce another antiserum preparation.
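
A series of overlapping peptides spanning a sequence can be enumerated programmatically. The sketch below is illustrative only: the peptide length (15 residues) and the offset between successive peptides (5 residues) are assumed values not specified in this description, the function name is hypothetical, and the example input is residues 208-239 of the mature protein as transcribed from the sequence listing (SEQ ID NO: 4).

```python
def overlapping_peptides(sequence, length=15, step=5):
    """Split a protein sequence into overlapping peptides covering every residue.

    `length` and `step` are illustrative defaults, not values from the text.
    A final peptide anchored at the C-terminus is added when the regular
    stepping would otherwise leave the last residues uncovered.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    if step > length:
        raise ValueError("step larger than length would leave gaps")
    if len(sequence) <= length:
        return [sequence]
    last_start = len(sequence) - length
    starts = list(range(0, last_start + 1, step))
    if starts[-1] != last_start:
        starts.append(last_start)
    return [sequence[s:s + length] for s in starts]


# Residues 208-239 of the mature polypeptide (from SEQ ID NO: 4), one-letter code.
EXAMPLE_REGION = "AVAVHEFGHALGLGHSSAPNSIMRPFYQGPVG"

if __name__ == "__main__":
    for peptide in overlapping_peptides(EXAMPLE_REGION):
        print(peptide)
```

Shorter offsets increase the overlap between neighboring peptides and therefore the number of peptides to synthesize; the choice is a practical trade-off rather than something dictated by the method.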

Standard techniques are used to generate monoclonal antibodies to either unpurified antigen, or, preferably, purified antigen.

X. Structure Activity Relationship

Information on the criticality of particular residues is determined using standard procedures and analysis. Standard mutagenesis analysis is performed, e.g., by generating many different variants at determined positions, e.g., at the positions identified above, and evaluating biological activities of the variants. This may be performed to the extent of determining positions which modify activity, or to focus on specific positions to determine the residues which can be substituted to either retain, block, or modulate biological activity.
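
The set of variants to be generated at determined positions can be listed systematically. The following is a minimal sketch, assuming single amino acid substitutions at the three zinc-chelating histidines (positions 212, 216 and 222) within residues 208-223 of the mature protein as transcribed from the sequence listing (SEQ ID NO: 4). The function name and the variant naming scheme (e.g., H212A) are illustrative conventions rather than part of the original disclosure.

```python
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter code

# Residues 208-223 of the mature polypeptide (from SEQ ID NO: 4).
FRAGMENT = "AVAVHEFGHALGLGHS"
FRAGMENT_START = 208


def single_substitutions(fragment, first_residue_number, positions):
    """Map variant names (e.g. 'H212A') to the substituted fragment sequence."""
    variants = {}
    for position in positions:
        index = position - first_residue_number
        if not 0 <= index < len(fragment):
            raise ValueError(f"position {position} lies outside the fragment")
        wild_type = fragment[index]
        for replacement in AMINO_ACIDS:
            if replacement == wild_type:
                continue  # skip the unchanged residue
            name = f"{wild_type}{position}{replacement}"
            variants[name] = fragment[:index] + replacement + fragment[index + 1:]
    return variants


if __name__ == "__main__":
    variants = single_substitutions(FRAGMENT, FRAGMENT_START, (212, 216, 222))
    print(len(variants), "variants, e.g.", variants["H212A"])
```

Each position yields 19 substitutions, so the three histidines give 57 candidate variants whose biological activities could then be compared.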

Alternatively, analysis of natural variants can indicate what positions tolerate natural mutations. This may result from populational analysis of variation among individuals, or across strains or species. Samples from selected individuals are analyzed, e.g., by PCR analysis and sequencing. This allows evaluation of population polymorphisms.

All references cited herein are incorporated herein by reference to the same extent as if each individual publication or patent application was specifically and individually indicated to be incorporated by reference in its entirety for all purposes.

Many modifications and variations of this invention can be made without departing from its spirit and scope, as will be apparent to those skilled in the art. The specific embodiments described herein are offered by way of example only, and the invention is to be limited only by the terms of the appended claims, along with the full scope of equivalents to which such claims are entitled.

10 3695 base pairs nucleic acid single linear cDNA not provided CDS344..2032 mat_peptide 398..2032 misc_feature 3458 /note= “nucleotide3458 designated W, may be A or T” 1 CATGCAACAT AATCTTGCTC GATTCTAAAGTCAACGGATC CTGCAAAATT CGCGGCCGCG 60 TCAACCCATT AGGTCTTGGC CTTGGAATAAAATTGCTTCT CGTCTGATTC CCGGGCCCAC 120 CCGACCCAGC GGCGCAACCC TGGCCCTCCGGGACCCTCCG CTGACTCCAC CGCGCACTTC 180 CCGGGACCCC CACACACATC CCAGCCCTCCGGCCGATCCC TCCCTACTCG GTGCCGGGTG 240 CCCCCCTTTT TTTTCTAGGC CCGGATCTCCTCCCCCAGGT CCCCGGGGCG GCCCCAACCA 300 GGCCCCCTTC AAACCCCGCC GGCGGCCCGGGCTGGGGCGC ACC ATG CGG CTG CGG 355 Met Arg Leu Arg -18 -15 CTC CGG CTTCTG GCG CTG CTG CTT CTG CAT GCT GGC ACC GCC CGC GCG 403 Leu Arg Leu LeuAla Leu Leu Leu Leu His Ala Gly Thr Ala Arg Ala -10 -5 1 CGC CCC GAA GCCCTC GGC GCA GGA CTT AGC CTG GGC TGT GAG AAC TGG 451 Arg Pro Glu Ala LeuGly Ala Gly Leu Ser Leu Gly Cys Glu Asn Trp 5 10 15 CTG ACT CGC TAT GGTTAC CTA CCG CCA CCC GAC CCT GCC CAG GCC CAG 499 Leu Thr Arg Tyr Gly TyrLeu Pro Pro Pro Asp Pro Ala Gln Ala Gln 20 25 30 CTG CAG AGC CCT GAA AATTTG CGC GAT GCC ATC AAA GTC ATG CAA AGG 547 Leu Gln Ser Pro Glu Asn LeuArg Asp Ala Ile Lys Val Met Gln Arg 35 40 45 50 TTC GCG GGG CTG CCG GAGACC GGC CGC ATG GAC CCA GGG ACA GTG GCC 595 Phe Ala Gly Leu Pro Glu ThrGly Arg Met Asp Pro Gly Thr Val Ala 55 60 65 ACC ATG CGT AAG CCC CGC TGCTCC CTG CCT GAC GTG CTG GGG GTG GCG 643 Thr Met Arg Lys Pro Arg Cys SerLeu Pro Asp Val Leu Gly Val Ala 70 75 80 GGG CTG GTC AGG CGG CGT CGC CGGTAC GGT CTG AGC GGC AGC GTG TGG 691 Gly Leu Val Arg Arg Arg Arg Arg TyrGly Leu Ser Gly Ser Val Trp 85 90 95 GAG AAG CGA ACC GTG ACA TGG AGG GTACGT TCC TTC CCC CAG AGC TCC 739 Glu Lys Arg Thr Val Thr Trp Arg Val ArgSer Phe Pro Gln Ser Ser 100 105 110 CAG GTG AGC CAG GAG ACC GTG CGG GTCCTC GTG AGC TAT GCC CTG ATG 787 Gln Val Ser Gln Glu Thr Val Arg Val LeuVal Ser Tyr Ala Leu Met 115 120 125 130 GCG TGG GGC ATG GAG TCA GGC CTCACA TTT CAT GAG GTG GAT TCC CCC 835 Ala Trp Gly Met Glu Ser Gly Leu ThrPhe His Glu Val Asp Ser Pro 
135 140 145 CAG GGC CAG GAG CCC GAC ATC CTCATA GAC TTT GCC CGC GCC TTC CAA 883 Gln Gly Gln Glu Pro Asp Ile Leu IleAsp Phe Ala Arg Ala Phe Gln 150 155 160 CAG GAC AGC TAC CCC TTC GAC GGGTTG GGG GGC ACC CTA GCC CAT GCC 931 Gln Asp Ser Tyr Pro Phe Asp Gly LeuGly Gly Thr Leu Ala His Ala 165 170 175 TTC TTC CCT GGG GAG CAC CCC ATCTCC GGG GAC ACT CAC TTT GAC GAT 979 Phe Phe Pro Gly Glu His Pro Ile SerGly Asp Thr His Phe Asp Asp 180 185 190 GAG GAG ACC TGG ACT TTT GGG TCAAAA GAC GGC GAG GGG ACC GAC CTG 1027 Glu Glu Thr Trp Thr Phe Gly Ser LysAsp Gly Glu Gly Thr Asp Leu 195 200 205 210 TTT GCC GTG GCT GTC CAT GAGTTT GGC CAC GCC CTG GGC ATG GGC CAC 1075 Phe Ala Val Ala Val His Glu PheGly His Ala Leu Gly Met Gly His 215 220 225 TCC TCA GCC CCC GAC TCC ATTATG AGG CCC TTC TAC CAG GGT CCG GTG 1123 Ser Ser Ala Pro Asp Ser Ile MetArg Pro Phe Tyr Gln Gly Pro Val 230 235 240 GGC GAC CCT GAC AAG TAC CGCCTG TCT CTG GAT GAC CGC GAT GGC CTG 1171 Gly Asp Pro Asp Lys Tyr Arg LeuSer Leu Asp Asp Arg Asp Gly Leu 245 250 255 CAG CAA CTC TAT GGG AAG GCGCCC CAA ACC CCA TAT GAC AAG CCC ACA 1219 Gln Gln Leu Tyr Gly Lys Ala ProGln Thr Pro Tyr Asp Lys Pro Thr 260 265 270 AGG AAA CCC CTG GCT CCT CCGCCC CAG CCC CCG GCC TCG CCC ACA CAC 1267 Arg Lys Pro Leu Ala Pro Pro ProGln Pro Pro Ala Ser Pro Thr His 275 280 285 290 AGC CCA TCC TTC CCC ATCCCT GAT CGA TGT GAG GGC AAT TTT GAC GCC 1315 Ser Pro Ser Phe Pro Ile ProAsp Arg Cys Glu Gly Asn Phe Asp Ala 295 300 305 ATC GCC AAC ATC CGA GGGGAA ACT TTC TTC TTC AAA GGC CCC TGG TTC 1363 Ile Ala Asn Ile Arg Gly GluThr Phe Phe Phe Lys Gly Pro Trp Phe 310 315 320 TGG CGC CTC CAG CCC TCCGGA CAG CTG GTG TCC CCG CGA CCC GCA CGG 1411 Trp Arg Leu Gln Pro Ser GlyGln Leu Val Ser Pro Arg Pro Ala Arg 325 330 335 CTG CAC CGC TTC TGG GAGGGG CTG CCC GCC CAG GTG AGG GTG GTG CAG 1459 Leu His Arg Phe Trp Glu GlyLeu Pro Ala Gln Val Arg Val Val Gln 340 345 350 GCC GCC TAT GCT CGG CACCGA GAC GGC CGA ATC CTC CTC TTT AGC GGG 1507 Ala Ala Tyr Ala Arg His ArgAsp Gly Arg Ile Leu Leu Phe 
Ser Gly 355 360 365 370 CCC CAG TTC TGG GTGTTC CAG GAC CGG CAG CTG GAG GGC GGG GCG CGG 1555 Pro Gln Phe Trp Val PheGln Asp Arg Gln Leu Glu Gly Gly Ala Arg 375 380 385 CCG CTC ACG GAG CTGGGG CTG CCC CCG GGA GAG GAG GTG GAC GCC GTG 1603 Pro Leu Thr Glu Leu GlyLeu Pro Pro Gly Glu Glu Val Asp Ala Val 390 395 400 TTC TCG TGG CCA CAGAAC GGG AAG ACC TAC CTG GTC CGC GGC CGG CAG 1651 Phe Ser Trp Pro Gln AsnGly Lys Thr Tyr Leu Val Arg Gly Arg Gln 405 410 415 TAC TGG CGC TAC GACGAG GCG GCG GCG CGC CCG GAC CCC GGC TAC CTT 1699 Tyr Trp Arg Tyr Asp GluAla Ala Ala Arg Pro Asp Pro Gly Tyr Leu 420 425 430 CGC GAC CTG AGC CTCTGG GAA GGC GCG CCC CCC TCC CCT GAC GAT GTC 1747 Arg Asp Leu Ser Leu TrpGlu Gly Ala Pro Pro Ser Pro Asp Asp Val 435 440 445 450 ACC GTC AGC AACGCA GGT GAC ACC TAC TTC TTC AAG GGC GCC CAC TAC 1795 Thr Val Ser Asn AlaGly Asp Thr Tyr Phe Phe Lys Gly Ala His Tyr 455 460 465 TGG CGC TTC CCCAAG AAC AGC ATC AAG ACC GAG CCG GAC GCC CCC CAG 1843 Trp Arg Phe Pro LysAsn Ser Ile Lys Thr Glu Pro Asp Ala Pro Gln 470 475 480 CCC ATG GGG CCCAAC TGG CTG GAC TGC CCC GCC CCG AGC TCT GGT CCC 1891 Pro Met Gly Pro AsnTrp Leu Asp Cys Pro Ala Pro Ser Ser Gly Pro 485 490 495 CGC GCC CCC AGGCCC CCC AAA GGG ACC CCC GTG TCC GAA ACC TGC GAT 1939 Arg Ala Pro Arg ProPro Lys Gly Thr Pro Val Ser Glu Thr Cys Asp 500 505 510 TGT CAG TGC GAGCTC AAC CAG GCC GCA GGA CGT TGG CCT GCT CCC ATC 1987 Cys Gln Cys Glu LeuAsn Gln Ala Ala Gly Arg Trp Pro Ala Pro Ile 515 520 525 530 CCG CTG CTCCTC TTG CCC CTG CTG GTG GGG GGT GTA GCC TCC CGC 2032 Pro Leu Leu Leu LeuPro Leu Leu Val Gly Gly Val Ala Ser Arg 535 540 545 TGATGGGGGGAGCCATCCAG ACCGAACAGC GCCCTCCACG GCCGAGTCCC CCGCCGCTGG 2092 ACCTGGTCGGGGGTTGTGAG GCGCTGCGGA GGCCCCTTGT CTGTTCCCAC GGACGGGGGC 2152 TCGGGCGCGGACTAAGCAGG GGGGATCTCC CGCGCAGGGG CGGCGGCGGC GGGGACCGGT 2212 CGCCTGGCGCTGGGCTCAGT CTCCTCAGGG TCTGAGACCC CGGCGCTGCC ACCGGAACCC 2272 GCCTTCAGGGGCGCACGCGC GCTGGGACCA TGCGTCGGTC GTCGCCCCCG TCGTTCCCTC 2332 CCGGCTGCCGCCAGGGGGCG GTCGGACCCC GCCTCCCGAG 
CCCGGGGAGG GGCGGGGAGG 2392 ACAAGGGGCGGGCCCGCGGC CTCACCCGGA GGGACGGCAG CCCCGGTCGC GCGCTGGCCC 2452 CGCAGGACCTTCCTTTTCCA GGAAGAGCCA GCTTTTCTCG GAGCGCAGTC CTGGGACTCT 2512 CCGCAGCCCCGCCCCGCCTG GCCACTGCGT CTGGCATTCC TGGGTCGTTA GAGGACAGGC 2572 CTGACTGCGAAGCTGTGCCT TGCCCCTCTC CCACCCGCAG TTTCTCACCC CGTTCTGCTC 2632 CCACAAGGCCCCCCTACAGT CACTGCCACA CTGGTGGGGA CCTGGGACCC AGACCCGGAA 2692 CCAGCCCAGATATCACCCCT GAGGACCCAT GCGCCACGTC CTGGGTGGTG GAATCAGTGG 2752 GTGGAGGGACGACCCTTGCT CTCCAGGCTG TTAACCTTTT CCGTTGCTCC CCCGCCACCC 2812 ACCTCCTCCTCCCCAGGCCA CCCAACTTGG GCACCTCCCT GGGCCCAGAA CTGCCTTCCA 2872 TTCAATGGGGAACCCTTCTA TCCCCAAGAA CCCCTTCCCT GCTTGCACCC TGGAGAGAAC 2932 AGCTTGACTCCCATCAACTC AACGCTGGTG GAAAGACAGG GACCGAACCC TGGCTCAGGC 2992 CTGGTCATTGCCTCCTCAGC ACTCCCTCCT GGGAGGCCTT AGCTCTAGAG TGAGGGGTGG 3052 GTGGAACCTGGGGGCACCTC GTTCACCCTG TCCCCACTCC CCACAGTTTT AGGATCTAAA 3112 TGATTGCCTCTGGAACTATT CTTCTAGACT ATCCCACATC AGAATCACTG GGAAATTTAA 3172 GTTTGCAGATCCCACACTCA CCCTGAATCC TCACTCAGGG TGGGGTCAGG AATCTGCATT 3232 TTAACTAGTCGCGGGGATTG TGGGGGGCAG TAGCTGGCTG TTTCGTGGCA TTTCTGTGGC 3292 TCTGCAGTGTTCCTCCACCC CAGGACCAAT ATGTTCAGGC CACACCGATG GCCTGAACCC 3352 CATGGGTAGAGTCACTTAGG GGCCACTTCC TAAGTTGCTG TCCAGCCTCA GTGACCCCCT 3412 AGTGCTTCCTGGAGCTGAGG CTGTGGGCGG CTGTCCCAGC AACCAWGCGA GGGGTTGCCC 3472 CAGTTGCTCATACAAACAGA TCAGCATGAG GACAGAAGGC AGGAGACTTT GGTCAGTTAC 3532 CTGGGAATTCTGGGCTGCCA GGAAACGATT TGGGCCTCTG TCAGTTTCTT TTCCATGTAT 3592 GAGGAGGGGGAAATTTGTAT ATTAGATACT TATTCATCCC ACTCTGGACA ATAAAAACGA 3652 ATGTACAAAAAAAACATAAA AAAAAAAAAT AAAGAAAATC AAA 3695 563 amino acids amino acidlinear protein not provided 2 Met Arg Leu Arg Leu Arg Leu Leu Ala LeuLeu Leu Leu His Ala Gly -18 -15 -10 -5 Thr Ala Arg Ala Arg Pro Glu AlaLeu Gly Ala Gly Leu Ser Leu Gly 1 5 10 Cys Glu Asn Trp Leu Thr Arg TyrGly Tyr Leu Pro Pro Pro Asp Pro 15 20 25 30 Ala Gln Ala Gln Leu Gln SerPro Glu Asn Leu Arg Asp Ala Ile Lys 35 40 45 Val Met Gln Arg Phe Ala GlyLeu Pro Glu Thr Gly Arg Met Asp Pro 50 55 60 Gly Thr Val Ala Thr Met 
ArgLys Pro Arg Cys Ser Leu Pro Asp Val 65 70 75 Leu Gly Val Ala Gly Leu ValArg Arg Arg Arg Arg Tyr Gly Leu Ser 80 85 90 Gly Ser Val Trp Glu Lys ArgThr Val Thr Trp Arg Val Arg Ser Phe 95 100 105 110 Pro Gln Ser Ser GlnVal Ser Gln Glu Thr Val Arg Val Leu Val Ser 115 120 125 Tyr Ala Leu MetAla Trp Gly Met Glu Ser Gly Leu Thr Phe His Glu 130 135 140 Val Asp SerPro Gln Gly Gln Glu Pro Asp Ile Leu Ile Asp Phe Ala 145 150 155 Arg AlaPhe Gln Gln Asp Ser Tyr Pro Phe Asp Gly Leu Gly Gly Thr 160 165 170 LeuAla His Ala Phe Phe Pro Gly Glu His Pro Ile Ser Gly Asp Thr 175 180 185190 His Phe Asp Asp Glu Glu Thr Trp Thr Phe Gly Ser Lys Asp Gly Glu 195200 205 Gly Thr Asp Leu Phe Ala Val Ala Val His Glu Phe Gly His Ala Leu210 215 220 Gly Met Gly His Ser Ser Ala Pro Asp Ser Ile Met Arg Pro PheTyr 225 230 235 Gln Gly Pro Val Gly Asp Pro Asp Lys Tyr Arg Leu Ser LeuAsp Asp 240 245 250 Arg Asp Gly Leu Gln Gln Leu Tyr Gly Lys Ala Pro GlnThr Pro Tyr 255 260 265 270 Asp Lys Pro Thr Arg Lys Pro Leu Ala Pro ProPro Gln Pro Pro Ala 275 280 285 Ser Pro Thr His Ser Pro Ser Phe Pro IlePro Asp Arg Cys Glu Gly 290 295 300 Asn Phe Asp Ala Ile Ala Asn Ile ArgGly Glu Thr Phe Phe Phe Lys 305 310 315 Gly Pro Trp Phe Trp Arg Leu GlnPro Ser Gly Gln Leu Val Ser Pro 320 325 330 Arg Pro Ala Arg Leu His ArgPhe Trp Glu Gly Leu Pro Ala Gln Val 335 340 345 350 Arg Val Val Gln AlaAla Tyr Ala Arg His Arg Asp Gly Arg Ile Leu 355 360 365 Leu Phe Ser GlyPro Gln Phe Trp Val Phe Gln Asp Arg Gln Leu Glu 370 375 380 Gly Gly AlaArg Pro Leu Thr Glu Leu Gly Leu Pro Pro Gly Glu Glu 385 390 395 Val AspAla Val Phe Ser Trp Pro Gln Asn Gly Lys Thr Tyr Leu Val 400 405 410 ArgGly Arg Gln Tyr Trp Arg Tyr Asp Glu Ala Ala Ala Arg Pro Asp 415 420 425430 Pro Gly Tyr Leu Arg Asp Leu Ser Leu Trp Glu Gly Ala Pro Pro Ser 435440 445 Pro Asp Asp Val Thr Val Ser Asn Ala Gly Asp Thr Tyr Phe Phe Lys450 455 460 Gly Ala His Tyr Trp Arg Phe Pro Lys Asn Ser Ile Lys Thr GluPro 465 470 475 Asp Ala Pro Gln Pro Met Gly Pro Asn Trp Leu Asp Cys ProAla Pro 
480 485 490 Ser Ser Gly Pro Arg Ala Pro Arg Pro Pro Lys Gly ThrPro Val Ser 495 500 505 510 Glu Thr Cys Asp Cys Gln Cys Glu Leu Asn GlnAla Ala Gly Arg Trp 515 520 525 Pro Ala Pro Ile Pro Leu Leu Leu Leu ProLeu Leu Val Gly Gly Val 530 535 540 Ala Ser Arg 545 3691 base pairsnucleic acid single linear cDNA not provided CDS 343..2028 mat_peptide406..2028 misc_feature 3454 /note= “nucleotide 3454 designated W, may beA or T.” 3 CATGCAACAT AATCTTGCTC GATTCTAAAG TCAACGGATC CTGCAAAATTCGCGGCCGCG 60 TCAACCCATT AGGTCTTGGC CTTGGAATAA AATTGCTTCT CGTCTGATTCCCGGGCCCAC 120 CCGACCCAGC GGCGCAACCC TGGCCCTCCG GGACCCTCCG CTGACTCCACCGCGCACTTC 180 CCGGGACCCC CACACACATC CCAGCCCTCC GGCCGATCCC TCCCTACTCGGTGCCGGGTG 240 CCCCCCGCCC TCTCCAGGCC CGGATCTCCT CCCCCAGGTC CCCGGGGCGGCCCCAGCCAG 300 GCCCCCTTCG AACCCCGCCG GCGGCCCGGG CTGGGGCGCA CC ATG CGGCTG CGG 354 Met Arg Leu Arg -21 -20 CTC CGG CTT CTG GCG CTG CTG CTT CTGCTG CTG GCA CCG CCC GCG CGC 402 Leu Arg Leu Leu Ala Leu Leu Leu Leu LeuLeu Ala Pro Pro Ala Arg -15 -10 -5 GCC CCG AAG CCC TCG GCG CAG GAC GTGAGC CTG GGC GTG GAC TGG CTG 450 Ala Pro Lys Pro Ser Ala Gln Asp Val SerLeu Gly Val Asp Trp Leu 1 5 10 15 ACT CGC TAT GGT TAC CTG CCG CCA CCCCAC CCT GCC CAG GCC CAG CTG 498 Thr Arg Tyr Gly Tyr Leu Pro Pro Pro HisPro Ala Gln Ala Gln Leu 20 25 30 CAG AGC CCT GAG AAG TTG CGC GAT GCC ATCAAA GTC ATG CAG AGG TTC 546 Gln Ser Pro Glu Lys Leu Arg Asp Ala Ile LysVal Met Gln Arg Phe 35 40 45 GCG GGG CTG CCG GAG ACC GGC CGC ATG GAC CCAGGG ACA GTG GCC ACC 594 Ala Gly Leu Pro Glu Thr Gly Arg Met Asp Pro GlyThr Val Ala Thr 50 55 60 ATG CGT AAG CCC CGC TGC TCC CTG CCT GAC GTG CTGGGG GTG GCG GGG 642 Met Arg Lys Pro Arg Cys Ser Leu Pro Asp Val Leu GlyVal Ala Gly 65 70 75 CTG GTC AGG CGG CGT CGC CGG TAC GCT CTG AGC GGC AGCGTG TGG AAG 690 Leu Val Arg Arg Arg Arg Arg Tyr Ala Leu Ser Gly Ser ValTrp Lys 80 85 90 95 AAG CGA ACC CTG ACA TGG AGG GTA CGT TCC TTC CCC CAGAGC TCC CAG 738 Lys Arg Thr Leu Thr Trp Arg Val Arg Ser Phe Pro Gln SerSer Gln 100 105 110 CTG AGC CAG GAG ACC 
GTG CGG GTC CTC ATG AGC TAT GCCCTG ATG GCC 786 Leu Ser Gln Glu Thr Val Arg Val Leu Met Ser Tyr Ala LeuMet Ala 115 120 125 TGG GGC ATG GAG TCA GGC CTC ACA TTT CAT GAG GTG GATTCC CCC CAG 834 Trp Gly Met Glu Ser Gly Leu Thr Phe His Glu Val Asp SerPro Gln 130 135 140 GGC CAG GAG CCC GAC ATC CTC ATC GAC TTT GCC CGC GCCTTC CAC CAG 882 Gly Gln Glu Pro Asp Ile Leu Ile Asp Phe Ala Arg Ala PheHis Gln 145 150 155 GAC AGC TAC CCC TTC GAC GGG TTG GGG GGC ACC CTA GCCCAT GCC TTC 930 Asp Ser Tyr Pro Phe Asp Gly Leu Gly Gly Thr Leu Ala HisAla Phe 160 165 170 175 TTC CCT GGG GAG CAC CCC ATC TCC GGG GAC ACT CACTTT GAC GAT GAG 978 Phe Pro Gly Glu His Pro Ile Ser Gly Asp Thr His PheAsp Asp Glu 180 185 190 GAG ACC TGG ACT TTT GGG TCA AAA GAC GGC GAG GGGACC GAC CTG TTT 1026 Glu Thr Trp Thr Phe Gly Ser Lys Asp Gly Glu Gly ThrAsp Leu Phe 195 200 205 GCC GTG GCT GTC CAT GAG TTT GGC CAC GCC CTG GGCCTG GGC CAC TCC 1074 Ala Val Ala Val His Glu Phe Gly His Ala Leu Gly LeuGly His Ser 210 215 220 TCA GCC CCC AAC TCC ATT ATG AGG CCC TTC TAC CAGGGT CCG GTG GGC 1122 Ser Ala Pro Asn Ser Ile Met Arg Pro Phe Tyr Gln GlyPro Val Gly 225 230 235 GAC CCT GAC AAG TAC CGC CTG TCT CAG GAT GAC CGCGAT GGC CTG CAG 1170 Asp Pro Asp Lys Tyr Arg Leu Ser Gln Asp Asp Arg AspGly Leu Gln 240 245 250 255 CAA CTC TAT GGG AAG GCG CCC CAA ACC CCA TATGAC AAG CCC ACA AGG 1218 Gln Leu Tyr Gly Lys Ala Pro Gln Thr Pro Tyr AspLys Pro Thr Arg 260 265 270 AAA CCC CTG GCT CCT CCG CCC CAG CCC CCG GCCTCG CCC ACA CAC AGC 1266 Lys Pro Leu Ala Pro Pro Pro Gln Pro Pro Ala SerPro Thr His Ser 275 280 285 CCA TCC TTC CCC ATC CCT GAT CGA TGT GAG GGCAAT TTT GAC GCC ATC 1314 Pro Ser Phe Pro Ile Pro Asp Arg Cys Glu Gly AsnPhe Asp Ala Ile 290 295 300 GCC AAC ATC CGA GGG GAA ACT TTC TTC TTC AAAGGC CCC TGG TTC TGG 1362 Ala Asn Ile Arg Gly Glu Thr Phe Phe Phe Lys GlyPro Trp Phe Trp 305 310 315 CGC CTC CAG CCC TCC GGA CAG CTG GTG TCC CCGCGA CCC GCA CGG CTG 1410 Arg Leu Gln Pro Ser Gly Gln Leu Val Ser Pro ArgPro Ala Arg Leu 320 325 330 335 CAC CGC 
TTC TGG GAG GGG CTG CCC GCC CAGGTG AGG GTG GTG CAG GCC 1458 His Arg Phe Trp Glu Gly Leu Pro Ala Gln ValArg Val Val Gln Ala 340 345 350 GCC TAT GCT CGG CAC CGA GAC GGC CGA ATCCTC CTC TTT AGC GGG CCC 1506 Ala Tyr Ala Arg His Arg Asp Gly Arg Ile LeuLeu Phe Ser Gly Pro 355 360 365 CAG TTC TGG GTG TTC CAG GAC CGG CAG CTGGAG GGC GGG GCG CGG CCG 1554 Gln Phe Trp Val Phe Gln Asp Arg Gln Leu GluGly Gly Ala Arg Pro 370 375 380 CTC ACG GAG CTG GGG CTG CCC CCG GGA GAGGAG GTG GAC GCC GTG TTC 1602 Leu Thr Glu Leu Gly Leu Pro Pro Gly Glu GluVal Asp Ala Val Phe 385 390 395 TCG TGG CCA CAG AAC GGG AAG ACC TAC CTGGTC CGC GGC CGG CAG TAC 1650 Ser Trp Pro Gln Asn Gly Lys Thr Tyr Leu ValArg Gly Arg Gln Tyr 400 405 410 415 TGG CGC TAC GAC GAG GCG GCG GCG CGCCCG GAC CCC GGC TAC CCT CGC 1698 Trp Arg Tyr Asp Glu Ala Ala Ala Arg ProAsp Pro Gly Tyr Pro Arg 420 425 430 GAC CTG AGC CTC TGG GAA GGC GCG CCCCCC TCC CCT GAC GAT GTC ACC 1746 Asp Leu Ser Leu Trp Glu Gly Ala Pro ProSer Pro Asp Asp Val Thr 435 440 445 GTC AGC AAC GCA GGT GAC ACC TAC TTCTTC AAG GGC GCC CAC TAC TGG 1794 Val Ser Asn Ala Gly Asp Thr Tyr Phe PheLys Gly Ala His Tyr Trp 450 455 460 CGC TTC CCC AAG AAC AGC ATC AAG ACCGAG CCG GAC GCC CCC CAG CCC 1842 Arg Phe Pro Lys Asn Ser Ile Lys Thr GluPro Asp Ala Pro Gln Pro 465 470 475 ATG GGG CCC AAC TGG CTG GAC TGC CCCGCC CCG AGC TCT GGT CCC CGC 1890 Met Gly Pro Asn Trp Leu Asp Cys Pro AlaPro Ser Ser Gly Pro Arg 480 485 490 495 GCC CCC AGG CCC CCC AAA GCG ACCCCC GTG TCC GAA ACC TGC GAT TGT 1938 Ala Pro Arg Pro Pro Lys Ala Thr ProVal Ser Glu Thr Cys Asp Cys 500 505 510 CAG TGC GAG CTC AAC CAG GCC GCAGGA CGT TGG CCT GCT CCC ATC CCG 1986 Gln Cys Glu Leu Asn Gln Ala Ala GlyArg Trp Pro Ala Pro Ile Pro 515 520 525 CTG CTC CTC TTG CCC CTG CTG GTGGGG GGT GTA GCC TCC CGC 2028 Leu Leu Leu Leu Pro Leu Leu Val Gly Gly ValAla Ser Arg 530 535 540 TGATGGGGGG AGCCATCCAG ACCGAACAGC GCCCTCCACGGCCGAGTCCC CCGCCGCTGG 2088 ACCTGGTCGG GGGTTGTGAG GCGCTGCGGA GGCCCCTTGTCTGTTCCCAC GGACGGGGGC 2148 TCGGGCGCGG 
ACTAAGCAGG GGGGATCTCC CGCGCAGGGGCGGCGGCGGC GGGGACCGGT 2208 CGCCTGGCGC TGGGCTCAGT CTCCTCAGGG TCTGAGACCCCGGCGCTGCC ACCGGAACCC 2268 GCCTTCAGGG GCGCACGCGC GCTGGGACCA TGCGTCGGTCGTCGCCCCCG TCGTTCCCTC 2328 CCGGCTGCCG CCAGGGGGCG GTCGGACCCC GCCTCCCGAGCCCGGGGAGG GGCGGGGAGG 2388 ACAAGGGGCG GGCCCGCGGC CTCACCCGGA GGGACGGCAGCCCCGGTCGC GCGCTGGCCC 2448 CGCAGGACCT TCCTTTTCCA GGAAGAGCCA GCTTTTCTCGGAGCGCAGTC CTGGGACTCT 2508 CCGCAGCCCC GCCCCGCCTG GCCACTGCGT CTGGCATTCCTGGGTCGTTA GAGGACAGGC 2568 CTGACTGCGA AGCTGTGCCT TGCCCCTCTC CCACCCGCAGTTTCTCACCC CGTTCTGCTC 2628 CCACAAGGCC CCCCTACAGT CACTGCCACA CTGGTGGGGACCTGGGACCC AGACCCGGAA 2688 CCAGCCCAGA TATCACCCCT GAGGACCCAT GCGCCACGTCCTGGGTGGTG GAATCAGTGG 2748 CTGGAGGGAC GACCCTTGCT CTCCAGGCTG TTAACCTTTTCCGTTGCTCC CCCGCCACCC 2808 ACCTCCTCCT CCCCAGGCCA CCCAACTTGG GCACCTCCCTGGGCCCAGAA CTGCCTTCCA 2868 TTCAATGGGG AACCCTTCTA TCCCCAAGAA CCCCTTCCCTGCTTGCACCC TGGAGAGAAC 2928 AGCTTGACTC CCATCAACTC AACGCTGGTG GAAAGACAGGGACCGAACCC TGGCTCAGGC 2988 CTGGTCATTG CCTCCTCAGC ACTCCCTCCT GGGAGGCCTTAGCTCTAGAG TGAGGGGTGG 3048 GTGGAACCTG GGGGCACCTC GTTCACCCTG TCCCCACTCCCCACAGTTTT AGGATCTAAA 3108 TGATTGCCTC TGGAACTATT CTTCTAGACT ATCCCACATCAGAATCACTG GGAAATTTAA 3168 GTTTGCAGAT CCCACACTCA CCCTGAATCC TCACTCAGGGTGGGGTCAGG AATCTGCATT 3228 TTAACTAGTC GCGGGGATTG TGGGGGGCAG TAGCTGGCTGTTTCGTGGCA TTTCTGTGGC 3288 TCTGCAGTGT TCCTCCACCC CAGGACCAAT ATGTTCAGGCCACACCGATG GCCTGAACCC 3348 CATGGGTAGA GTCACTTAGG GGCCACTTCC TAAGTTGCTGTCCAGCCTCA GTGACCCCCT 3408 AGTGCTTCCT GGAGCTGAGG CTGTGGGCGG CTGTCCCAGCAACCAWGCGA GGGGTTGCCC 3468 CAGTTGCTCA TACAAACAGA TCAGCATGAG GACAGAAGGCAGGAGACTTT GGTCAGTTAC 3528 CTGGGAATTC TGGGCTGCCA GGAAACGATT TGGGCCTCTGTCAGTTTCTT TTCCATGTAT 3588 GAGGAGGGGG AAATTTGTAT ATTAGATACT TATTCATCCCACTCTGGACA ATAAAAACGA 3648 ATGTACAAAA AAAACATAAA AAAAAAAAAT AAAGAAAATCAAA 3691 562 amino acids amino acid linear protein not provided 4 MetArg Leu Arg Leu Arg Leu Leu Ala Leu Leu Leu Leu Leu Leu Ala -21 -20 -15-10 Pro Pro Ala Arg Ala Pro Lys Pro Ser Ala Gln Asp Val Ser Leu Gly -5 15 10 
Val Asp Trp Leu Thr Arg Tyr Gly Tyr Leu Pro Pro Pro His Pro Ala 1520 25 Gln Ala Gln Leu Gln Ser Pro Glu Lys Leu Arg Asp Ala Ile Lys Val 3035 40 Met Gln Arg Phe Ala Gly Leu Pro Glu Thr Gly Arg Met Asp Pro Gly 4550 55 Thr Val Ala Thr Met Arg Lys Pro Arg Cys Ser Leu Pro Asp Val Leu 6065 70 75 Gly Val Ala Gly Leu Val Arg Arg Arg Arg Arg Tyr Ala Leu Ser Gly80 85 90 Ser Val Trp Lys Lys Arg Thr Leu Thr Trp Arg Val Arg Ser Phe Pro95 100 105 Gln Ser Ser Gln Leu Ser Gln Glu Thr Val Arg Val Leu Met SerTyr 110 115 120 Ala Leu Met Ala Trp Gly Met Glu Ser Gly Leu Thr Phe HisGlu Val 125 130 135 Asp Ser Pro Gln Gly Gln Glu Pro Asp Ile Leu Ile AspPhe Ala Arg 140 145 150 155 Ala Phe His Gln Asp Ser Tyr Pro Phe Asp GlyLeu Gly Gly Thr Leu 160 165 170 Ala His Ala Phe Phe Pro Gly Glu His ProIle Ser Gly Asp Thr His 175 180 185 Phe Asp Asp Glu Glu Thr Trp Thr PheGly Ser Lys Asp Gly Glu Gly 190 195 200 Thr Asp Leu Phe Ala Val Ala ValHis Glu Phe Gly His Ala Leu Gly 205 210 215 Leu Gly His Ser Ser Ala ProAsn Ser Ile Met Arg Pro Phe Tyr Gln 220 225 230 235 Gly Pro Val Gly AspPro Asp Lys Tyr Arg Leu Ser Gln Asp Asp Arg 240 245 250 Asp Gly Leu GlnGln Leu Tyr Gly Lys Ala Pro Gln Thr Pro Tyr Asp 255 260 265 Lys Pro ThrArg Lys Pro Leu Ala Pro Pro Pro Gln Pro Pro Ala Ser 270 275 280 Pro ThrHis Ser Pro Ser Phe Pro Ile Pro Asp Arg Cys Glu Gly Asn 285 290 295 PheAsp Ala Ile Ala Asn Ile Arg Gly Glu Thr Phe Phe Phe Lys Gly 300 305 310315 Pro Trp Phe Trp Arg Leu Gln Pro Ser Gly Gln Leu Val Ser Pro Arg 320325 330 Pro Ala Arg Leu His Arg Phe Trp Glu Gly Leu Pro Ala Gln Val Arg335 340 345 Val Val Gln Ala Ala Tyr Ala Arg His Arg Asp Gly Arg Ile LeuLeu 350 355 360 Phe Ser Gly Pro Gln Phe Trp Val Phe Gln Asp Arg Gln LeuGlu Gly 365 370 375 Gly Ala Arg Pro Leu Thr Glu Leu Gly Leu Pro Pro GlyGlu Glu Val 380 385 390 395 Asp Ala Val Phe Ser Trp Pro Gln Asn Gly LysThr Tyr Leu Val Arg 400 405 410 Gly Arg Gln Tyr Trp Arg Tyr Asp Glu AlaAla Ala Arg Pro Asp Pro 415 420 425 Gly Tyr Pro Arg Asp Leu Ser Leu TrpGlu Gly Ala 
Pro Pro Ser Pro 430 435 440 Asp Asp Val Thr Val Ser Asn AlaGly Asp Thr Tyr Phe Phe Lys Gly 445 450 455 Ala His Tyr Trp Arg Phe ProLys Asn Ser Ile Lys Thr Glu Pro Asp 460 465 470 475 Ala Pro Gln Pro MetGly Pro Asn Trp Leu Asp Cys Pro Ala Pro Ser 480 485 490 Ser Gly Pro ArgAla Pro Arg Pro Pro Lys Ala Thr Pro Val Ser Glu 495 500 505 Thr Cys AspCys Gln Cys Glu Leu Asn Gln Ala Ala Gly Arg Trp Pro 510 515 520 Ala ProIle Pro Leu Leu Leu Leu Pro Leu Leu Val Gly Gly Val Ala 525 530 535 SerArg 540 28 base pairs nucleic acid single linear other nucleic acid/desc = “primer” not provided 5 CAGAAATGCC ACGAAACAGC CAGGTACT 28 28base pairs nucleic acid single linear other nucleic acid /desc =“primer” not provided 6 GCCCCAGTTG CTCATACAAA CAGATCAG 28 519 aminoacids amino acid not relevant linear peptide not provided 7 Met Gln GlnPhe Gly Gly Leu Glu Ala Thr Gly Ile Leu Asp Glu Ala 1 5 10 15 Thr LeuAla Leu Met Lys Thr Pro Arg Cys Ser Leu Pro Asp Leu Pro 20 25 30 Val LeuThr Gln Ala Arg Arg Arg Arg Gln Ala Pro Ala Pro Thr Lys 35 40 45 Trp AsnLys Arg Asn Leu Ser Trp Arg Val Arg Thr Phe Pro Arg Asp 50 55 60 Ser ProLeu Gly His Asp Thr Val Arg Ala Leu Met Tyr Tyr Ala Leu 65 70 75 80 LysVal Trp Ser Asp Ile Ala Pro Leu Asn Phe His Glu Val Ala Gly 85 90 95 SerThr Ala Asp Ile Gln Ile Asp Phe Ser Lys Ala Asp His Asn Asp 100 105 110Gly Tyr Pro Phe Asp Gly Pro Gly Gly Thr Val Ala His Ala Phe Phe 115 120125 Pro Gly His His His Thr Ala Gly Asp Thr His Phe Asp Asp Asp Glu 130135 140 Ala Trp Thr Phe Arg Ser Ser Asp Ala His Gly Met Asp Leu Phe Ala145 150 155 160 Val Ala Val His Glu Phe Gly His Ala Ile Gly Leu Ser HisVal Ala 165 170 175 Ala Ala His Ser Ile Met Arg Pro Tyr Tyr Gln Gly ProVal Gly Asp 180 185 190 Pro Leu Arg Tyr Gly Leu Pro Tyr Glu Asp Lys ValArg Val Trp Gln 195 200 205 Leu Tyr Gly Val Arg Glu Ser Val Ser Pro ThrAla Gln Pro Glu Glu 210 215 220 Pro Pro Leu Leu Pro Glu Pro Pro Asp AsnArg Ser Ser Ala Pro Pro 225 230 235 240 Arg Lys Asp Val Pro His Arg CysSer Thr His Phe Asp Ala Val Ala 245 250 
255 Gln Ile Arg Gly Glu Ala PhePhe Phe Lys Gly Lys Tyr Phe Trp Arg 260 265 270 Leu Thr Arg Asp Arg HisLeu Val Ser Leu Gln Pro Ala Gln Met His 275 280 285 Arg Phe Trp Arg GlyLeu Pro Leu His Leu Asp Ser Val Asp Ala Val 290 295 300 Tyr Glu Arg ThrSer Asp His Lys Ile Val Phe Phe Lys Gly Asp Arg 305 310 315 320 Tyr TrpVal Phe Lys Asp Asn Asn Val Glu Glu Gly Tyr Pro Arg Pro 325 330 335 ValSer Asp Phe Ser Leu Pro Pro Gly Gly Ile Asp Ala Ala Phe Ser 340 345 350Trp Ala His Asn Asp Arg Thr Tyr Phe Phe Lys Asp Gln Leu Tyr Trp 355 360365 Arg Tyr Asp Asp His Thr Arg His Met Asp Pro Gly Tyr Pro Ala Gln 370375 380 Ser Pro Leu Trp Arg Gly Val Pro Ser Thr Leu Asp Asp Ala Met Arg385 390 395 400 Trp Ser Asp Gly Ala Ser Tyr Phe Phe Arg Gly Gln Glu TyrTrp Lys 405 410 415 Val Leu Asp Gly Glu Leu Glu Val Ala Pro Gly Tyr ProGln Ser Thr 420 425 430 Ala Arg Asp Trp Leu Val Cys Gly Asp Ser Gln AlaAsp Gly Ser Val 435 440 445 Ala Ala Gly Val Asp Ala Ala Glu Gly Pro ArgAla Pro Pro Gly Gln 450 455 460 His Asp Gln Ser Arg Ser Glu Asp Gly TyrGlu Val Cys Ser Cys Thr 465 470 475 480 Ser Gly Ala Ser Ser Pro Pro GlyAla Pro Gly Pro Leu Val Ala Ala 485 490 495 Thr Met Leu Leu Leu Leu ProPro Leu Ser Pro Gly Ala Leu Trp Thr 500 505 510 Ala Ala Gln Ala Leu ThrLeu 515 564 amino acids amino acid not relevant linear peptide notprovided 8 Met Lys Arg Pro Arg Cys Gly Val Pro Asp Gln Phe Gly Val ArgVal 1 5 10 15 Lys Ala Asn Leu Arg Arg Arg Arg Lys Arg Tyr Ala Leu ThrGly Arg 20 25 30 Lys Trp Asn Asn His His Leu Thr Phe Ser Ile Gln Asn TyrThr Glu 35 40 45 Lys Leu Gly Trp Tyr His Ser Met Glu Ala Val Arg Arg AlaPhe Arg 50 55 60 Val Trp Glu Gln Ala Thr Pro Leu Val Phe Gln Glu Val ProTyr Glu 65 70 75 80 Asp Ile Arg Leu Arg Arg Gln Lys Glu Ala Asp Ile MetVal Leu Phe 85 90 95 Ala Ser Gly Phe His Gly Asp Ser Ser Pro Phe Asp GlyThr Gly Gly 100 105 110 Phe Leu Ala His Ala Tyr Phe Pro Gly Pro Gly LeuGly Gly Asp Thr 115 120 125 His Phe Asp Ala Asp Glu Pro Trp Thr Phe SerSer Thr Asp Leu His 130 135 140 Gly Asn Asn Leu Phe Leu 
Val Ala Val HisGlu Leu Gly His Ala Leu 145 150 155 160 Gly Leu Glu His Ser Ser Asn ProAsn Ala Ile Met Ala Pro Phe Tyr 165 170 175 Gln Trp Lys Asp Val Asp AsnPhe Lys Leu Pro Glu Asp Asp Leu Arg 180 185 190 Gly Ile Gln Gln Leu TyrGly Thr Pro Asp Gly Gln Pro Gln Pro Thr 195 200 205 Gln Pro Leu Pro ThrVal Thr Pro Arg Arg Pro Gly Arg Pro Asp His 210 215 220 Arg Pro Pro ArgPro Pro Gln Pro Pro Pro Pro Gly Gly Lys Pro Glu 225 230 235 240 Arg ProPro Lys Pro Gly Pro Pro Val Gln Pro Arg Ala Thr Glu Arg 245 250 255 ProAsp Gln Tyr Gly Pro Asn Ile Cys Asp Gly Asp Phe Asp Thr Val 260 265 270Ala Met Leu Arg Gly Glu Met Phe Val Phe Lys Gly Arg Trp Phe Trp 275 280285 Arg Val Arg His Asn Arg Val Leu Asp Asn Tyr Pro Met Pro Ile Gly 290295 300 His Phe Trp Arg Gly Leu Pro Gly Asp Ile Ser Ala Ala Tyr Glu Arg305 310 315 320 Gln Asp Gly Arg Phe Val Phe Phe Lys Gly Asp Arg Tyr TrpLeu Phe 325 330 335 Arg Glu Ala Asn Leu Glu Pro Gly Tyr Pro Gln Pro LeuThr Ser Tyr 340 345 350 Gly Leu Gly Ile Pro Tyr Asp Arg Ile Asp Thr AlaIle Trp Trp Glu 355 360 365 Pro Thr Gly His Thr Phe Phe Phe Gln Glu AspArg Tyr Trp Arg Phe 370 375 380 Asn Glu Glu Thr Gln Arg Gly Asp Pro GlyTyr Pro Lys Pro Ile Ser 385 390 395 400 Val Trp Gln Gly Ile Pro Ala SerPro Lys Gly Ala Phe Leu Ser Asn 405 410 415 Asp Ala Ala Tyr Thr Tyr PheTyr Lys Gly Thr Lys Tyr Trp Lys Phe 420 425 430 Asp Asn Glu Arg Leu ArgMet Glu Pro Gly Tyr Pro Lys Ser Ile Leu 435 440 445 Arg Asp Phe Met GlyCys Gln Glu His Val Glu Pro Gly Pro Arg Trp 450 455 460 Pro Asp Val AlaArg Pro Pro Phe Asn Pro His Gly Gly Ala Glu Pro 465 470 475 480 Gly AlaAsp Ser Ala Glu Gly Asp Val Gly Asp Gly Asp Gly Asp Phe 485 490 495 GlyAla Gly Val Asn Lys Asp Arg Gly Ser Arg Val Val Val Gln Met 500 505 510Glu Glu Val Ala Arg Thr Val Asn Val Val Met Val Leu Val Pro Leu 515 520525 Leu Leu Leu Leu Cys Val Leu Gly Leu Thr Tyr Ala Leu Val Gln Met 530535 540 Gln Arg Lys Gly Ala Pro Arg Val Leu Leu Tyr Cys Lys Arg Ser Leu545 550 555 560 Gln Glu Trp Val 582 amino acids amino acid not 
relevantlinear peptide not provided 9 Met Ser Pro Ala Pro Arg Pro Ser Arg CysLeu Leu Leu Pro Leu Leu 1 5 10 15 Thr Leu Gly Thr Ala Leu Ala Ser LeuGly Ser Ala Gln Ser Ser Ser 20 25 30 Phe Ser Pro Glu Ala Trp Leu Gln GlnTyr Gly Tyr Leu Pro Pro Gly 35 40 45 Asp Leu Arg Thr His Thr Gln Arg SerPro Gln Ser Leu Ser Ala Ala 50 55 60 Ile Ala Ala Met Gln Lys Phe Tyr GlyLeu Gln Val Thr Gly Lys Ala 65 70 75 80 Asp Ala Asp Thr Met Lys Ala MetArg Arg Pro Arg Cys Gly Val Pro 85 90 95 Asp Lys Phe Gly Ala Glu Ile LysAla Asn Val Arg Arg Lys Arg Tyr 100 105 110 Ala Ile Gln Gly Leu Lys TrpGln His Asn Glu Ile Thr Phe Cys Ile 115 120 125 Gln Asn Tyr Thr Pro LysVal Gly Glu Tyr Ala Thr Tyr Glu Ala Ile 130 135 140 Arg Lys Ala Phe ArgVal Trp Glu Ser Ala Thr Pro Leu Arg Phe Arg 145 150 155 160 Glu Val ProTyr Ala Tyr Ile Arg Glu Gly His Glu Lys Gln Ala Asp 165 170 175 Ile MetIle Phe Phe Ala Glu Gly Phe His Gly Asp Ser Thr Pro Phe 180 185 190 AspGly Glu Gly Gly Phe Leu Ala His Ala Tyr Phe Pro Gly Pro Asn 195 200 205Ile Gly Gly Asp Thr His Phe Asp Ser Ala Glu Pro Trp Thr Val Arg 210 215220 Asn Glu Asp Leu Asn Gly Asn Asp Ile Phe Leu Val Ala Val His Glu 225230 235 240 Leu Gly His Ala Leu Gly Leu Glu His Ser Ser Asp Pro Ser AlaIle 245 250 255 Met Ala Pro Phe Tyr Gln Trp Met Asp Thr Glu Asn Phe ValLeu Pro 260 265 270 Asp Asp Asp Arg Arg Gly Ile Gln Gln Leu Tyr Gly GlyGlu Ser Gly 275 280 285 Phe Pro Thr Lys Met Pro Pro Gln Pro Arg Thr ThrSer Arg Pro Ser 290 295 300 Val Pro Asp Lys Pro Lys Asn Pro Thr Tyr GlyPro Asn Ile Cys Asp 305 310 315 320 Gly Asn Phe Asp Thr Val Ala Met LeuArg Gly Glu Met Phe Val Phe 325 330 335 Lys Glu Arg Trp Phe Trp Arg ValArg Asn Asn Gln Val Met Asp Gly 340 345 350 Tyr Pro Met Pro Ile Gly GlnPhe Trp Arg Gly Leu Pro Ala Ser Ile 355 360 365 Asn Thr Ala Tyr Glu ArgLys Asp Gly Lys Phe Val Phe Phe Lys Gly 370 375 380 Asp Lys His Trp ValPhe Asp Glu Ala Ser Leu Glu Pro Gly Tyr Pro 385 390 395 400 Lys His IleLys Glu Leu Gly Arg Gly Leu Pro Thr Asp Lys Ile Asp 405 410 415 Ala AlaLeu 
Phe Trp Met Pro Asn Gly Lys Thr Tyr Phe Phe Arg Gly 420 425 430 AsnLys Tyr Tyr Arg Phe Asn Glu Glu Leu Arg Ala Val Asp Ser Glu 435 440 445Tyr Pro Lys Asn Ile Lys Val Trp Glu Gly Ile Pro Glu Ser Pro Arg 450 455460 Gly Ser Phe Met Gly Ser Asp Glu Val Phe Thr Tyr Phe Tyr Lys Gly 465470 475 480 Asn Lys Tyr Trp Lys Phe Asn Asn Gln Lys Leu Lys Val Glu ProGly 485 490 495 Tyr Pro Lys Ser Ala Leu Arg Asp Trp Met Gly Cys Pro SerGly Gly 500 505 510 Arg Pro Asp Glu Gly Thr Glu Glu Glu Thr Glu Val IleIle Ile Glu 515 520 525 Val Asp Glu Glu Gly Gly Gly Ala Val Ser Ala AlaAla Val Val Leu 530 535 540 Pro Val Leu Leu Leu Leu Leu Val Leu Ala ValGly Leu Ala Val Phe 545 550 555 560 Phe Phe Arg Arg His Gly Thr Pro ArgArg Leu Leu Tyr Cys Gln Arg 565 570 575 Ser Leu Leu Asp Lys Val 580 607amino acids amino acid not relevant linear peptide not provided 10 MetIle Leu Leu Thr Phe Ser Thr Gly Arg Arg Leu Asp Phe Val His 1 5 10 15His Ser Gly Val Phe Phe Leu Gln Thr Leu Leu Trp Ile Leu Cys Ala 20 25 30Thr Val Cys Gly Thr Glu Gln Tyr Phe Asn Val Glu Val Trp Leu Gln 35 40 45Lys Tyr Gly Tyr Leu Pro Pro Thr Asp Pro Arg Met Ser Val Leu Arg 50 55 60Ser Ala Glu Thr Met Gln Ser Ala Leu Ala Ala Met Gln Gln Phe Tyr 65 70 7580 Gly Ile Asn Met Thr Gly Lys Val Asp Arg Asn Thr Ile Asp Trp Met 85 9095 Lys Lys Pro Arg Cys Gly Val Pro Asp Gln Thr Arg Gly Ser Ser Lys 100105 110 Phe His Ile Arg Arg Lys Arg Tyr Ala Leu Thr Gly Gln Lys Trp Gln115 120 125 His Lys His Ile Thr Tyr Ser Ile Lys Asn Val Thr Pro Lys ValGly 130 135 140 Asp Pro Glu Thr Arg Lys Ala Ile Arg Arg Ala Phe Asp ValTrp Gln 145 150 155 160 Asn Val Thr Pro Leu Thr Phe Glu Glu Val Pro TyrSer Glu Leu Glu 165 170 175 Asn Gly Lys Arg Asp Val Asp Ile Thr Ile IlePhe Ala Ser Gly Phe 180 185 190 His Gly Asp Ser Ser Pro Phe Asp Gly GluGly Gly Phe Leu Ala His 195 200 205 Ala Tyr Phe Pro Gly Pro Gly Ile GlyGly Asp Thr His Phe Asp Ser 210 215 220 Asp Glu Pro Trp Thr Leu Gly AsnPro Asn His Asp Gly Asn Asp Leu 225 230 235 240 Phe Leu Val Ala Val HisGlu Leu Gly 
His Ala Leu Gly Leu Glu His 245 250 255 Ser Asn Asp Pro ThrAla Ile Met Ala Pro Phe Tyr Gln Tyr Met Glu 260 265 270 Thr Asp Asn PheLys Leu Pro Asn Asp Asp Leu Gln Gly Ile Gln Lys 275 280 285 Ile Tyr GlyPro Pro Asp Lys Ile Pro Pro Pro Thr Arg Pro Leu Pro 290 295 300 Thr ValPro Pro His Arg Ser Ile Pro Pro Ala Asp Pro Arg Lys Asn 305 310 315 320Asp Arg Pro Lys Pro Pro Arg Pro Pro Thr Gly Arg Pro Ser Tyr Pro 325 330335 Gly Ala Lys Pro Asn Ile Cys Asp Gly Asn Phe Asn Thr Leu Ala Ile 340345 350 Leu Arg Arg Glu Met Phe Val Phe Lys Asp Gln Trp Phe Trp Arg Val355 360 365 Arg Asn Asn Arg Val Met Asp Gly Tyr Pro Met Gln Ile Thr TyrPhe 370 375 380 Trp Arg Gly Leu Pro Pro Ser Ile Asp Ala Val Tyr Glu AsnSer Asp 385 390 395 400 Gly Asn Phe Val Phe Phe Lys Gly Asn Lys Tyr TrpVal Phe Lys Asp 405 410 415 Thr Thr Leu Gln Pro Gly Tyr Pro His Asp LeuIle Thr Leu Gly Ser 420 425 430 Gly Ile Pro Pro His Gly Ile Asp Ser AlaIle Trp Trp Glu Asp Val 435 440 445 Gly Lys Thr Tyr Phe Phe Lys Gly AspArg Tyr Trp Arg Tyr Ser Glu 450 455 460 Glu Met Lys Thr Met Asp Pro GlyTyr Pro Lys Pro Ile Thr Val Trp 465 470 475 480 Lys Gly Ile Pro Glu SerPro Gln Gly Ala Phe Val His Lys Glu Asn 485 490 495 Gly Phe Thr Tyr PheTyr Lys Gly Lys Glu Tyr Trp Lys Phe Asn Asn 500 505 510 Gln Ile Leu LysVal Glu Pro Gly Tyr Pro Arg Ser Ile Leu Lys Asp 515 520 525 Phe Met GlyCys Asp Gly Pro Thr Asp Arg Val Lys Glu Gly His Ser 530 535 540 Pro ProAsp Asp Val Asp Ile Val Ile Lys Leu Asp Asn Thr Ala Ser 545 550 555 560Thr Val Lys Ala Ile Ala Ile Val Ile Pro Cys Ile Leu Ala Leu Cys 565 570575 Leu Leu Val Leu Val Tyr Thr Val Phe Gln Phe Lys Arg Lys Gly Thr 580585 590 Pro Arg His Ile Leu Tyr Cys Lys Arg Ser Met Gln Glu Trp Val 595600 605

What is claimed is:
 1. A substantially pure or isolated polypeptide comprising: (a) an amino acid sequence which is at least 65% identical to the mature sequence of SEQ ID NO: 2 or (b) an amino acid sequence which is at least 65% identical to the mature sequence of SEQ ID NO: 4; wherein said polypeptide has proteolytic activity.
 2. The polypeptide of claim 1, comprising an amino acid sequence which is at least 65% identical to the mature sequence of SEQ ID NO: 2.
 3. The polypeptide of claim 2, comprising an amino acid sequence which is at least 80% identical to the mature sequence of SEQ ID NO: 2.
 4. The polypeptide of claim 3, comprising an amino acid sequence which is at least 95% identical to the mature sequence of SEQ ID NO: 2.
 5. The polypeptide of claim 1, comprising an amino acid sequence which is at least 65% identical to the mature sequence of SEQ ID NO: 4.
 6. The polypeptide of claim 5, comprising an amino acid sequence which is at least 80% identical to the mature sequence of SEQ ID NO: 4.
 7. The polypeptide of claim 6, comprising an amino acid sequence which is at least 95% identical to the mature sequence of SEQ ID NO: 4.
 8. A polypeptide, comprising an amino acid sequence that is identical to the mature sequence of SEQ ID NO: 2.
 9. The polypeptide of claim 1, further comprising: (a) a detection or purification tag selected from the group consisting of FLAG, His6, and Ig; or (b) sequence of a heterologous protein.
 10. A polypeptide, comprising an amino acid sequence that is identical to the mature sequence of SEQ ID NO: 4.
 11. The polypeptide of claim 1 that is a synthetic polypeptide.
 12. A sterile composition comprising: (a) the polypeptide of claim 1 that is sterile; or (b) said polypeptide of claim 1 and a suitable pharmaceutical carrier, wherein said carrier is formulated for oral, rectal, nasal, topical, or parenteral administration.
 13. A kit comprising a polypeptide of claim 1, and: (a) a compartment comprising said polypeptide; and/or (b) instructions for use or disposal of reagent in said kit.
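
The identity thresholds recited in the claims (65%, 80%, 95%) can be illustrated with a minimal sketch. It is not the method of measuring identity defined by this document. It assumes the two sequences have already been aligned to equal length by some external tool, that gaps are written as "-", and that identity is counted as matching positions divided by the number of residues in the reference; other conventions exist. The function names `percent_identity` and `meets_threshold` are hypothetical. The example input is a short one-letter-code fragment surrounding the His-Glu-x-Gly-His motif as it appears in SEQ ID NO: 4 and SEQ ID NO: 8, used only for illustration and not the full mature sequences.

```python
def percent_identity(aligned_ref: str, aligned_query: str) -> float:
    """Percent identity of a query against a reference, given a prior alignment.

    Assumption: both strings come from an existing alignment, have equal
    length, and use '-' for gaps. Identity is matches / reference residues.
    """
    if len(aligned_ref) != len(aligned_query):
        raise ValueError("aligned sequences must have equal length")
    # Count reference residues (gap columns in the reference are excluded).
    ref_len = sum(1 for r in aligned_ref if r != "-")
    if ref_len == 0:
        raise ValueError("reference contains no residues")
    # A match is a column where the reference residue equals the query residue.
    matches = sum(
        1 for r, q in zip(aligned_ref, aligned_query) if r != "-" and r == q
    )
    return 100.0 * matches / ref_len


def meets_threshold(aligned_ref: str, aligned_query: str, threshold: float = 65.0) -> bool:
    """True if the query reaches the given percent-identity threshold."""
    return percent_identity(aligned_ref, aligned_query) >= threshold


# Illustrative fragments only (one-letter codes):
# SEQ ID NO: 4 fragment: His Glu Phe Gly His Ala Leu Gly Leu Gly His
# SEQ ID NO: 8 fragment: His Glu Leu Gly His Ala Leu Gly Leu Glu His
ref_fragment = "HEFGHALGLGH"
query_fragment = "HELGHALGLEH"

identity = percent_identity(ref_fragment, query_fragment)
print(f"{identity:.1f}% identical")
print("meets 65%:", meets_threshold(ref_fragment, query_fragment, 65.0))
print("meets 95%:", meets_threshold(ref_fragment, query_fragment, 95.0))
```

In practice the comparison would be made over the full mature sequence after alignment, and the choice of denominator (reference length, alignment length, or shorter sequence) changes the reported percentage, so the convention should be stated alongside any figure.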